TY  - JOUR
T1  - Improve Image Classification Using Data Optimization
AU  - Berrabah, Djamel
AU  - Gafour, Yacine
PY  - 2023
DA  - December
DO  - 10.55549/epstem.1409569
JF  - The Eurasia Proceedings of Science Technology Engineering and Mathematics
JO  - EPSTEM
PB  - ISRES Publishing
WT  - DergiPark
SN  - 2602-3199
SP  - 262
EP  - 271
VL  - 26
LA  - en
AB  - Image classification is a fundamental task in machine learning that involves assigning labels or classes to images based on their content. It is often performed using convolutional neural networks (CNNs), which can learn and generalize patterns from large amounts of data. However, if the data is not sufficiently voluminous, overfitting can occur. In such cases, it is recommended to turn to classical machine learning techniques. Moreover, data that is insufficient for deep learning may still exceed the processing capacity of the machine, which poses significant challenges in terms of the storage, memory, and computational power required for learning. Our proposed approach addresses these challenges by optimizing the content of the dataset while preserving the information essential for classification. Identical or highly similar images are identified, grouped together, and represented by the most representative image in each group. At the same time, their sizes can be reduced. The approach also addresses another significant challenge: managing class imbalance within the dataset. Our approach has been evaluated and the results are promising.
KW  - Unsupervised linear/non-linear dimensionality reduction
KW  - data visualization technique
KW  - unsupervised learning algorithm
KW  - dataset optimization
CR  - Comon, P. (1994). Independent component analysis: A new concept? Signal Processing, 36(3), 287–314.
CR  - Cox, T., & Cox, M. (1994). Multidimensional scaling. London: Chapman & Hall.
CR  - Dutta, S., & Ghosh, A. K. (2016). On some transformations of high dimension, low sample size data for nearest neighbor classification. Machine Learning, 102, 57–83.
UR  - https://doi.org/10.55549/epstem.1409569
L1  - https://dergipark.org.tr/en/download/article-file/3618990
ER  -