Improve Image Classification Using Data Optimization
Abstract
Image classification is a fundamental task in machine learning that assigns labels or classes to images based on their content. It is often performed using convolutional neural networks (CNNs), which can learn and generalize patterns from large amounts of data. However, when the data is not sufficiently voluminous, overfitting can occur, and it is then recommended to turn to classical machine learning techniques. Yet a dataset that is insufficient for deep learning may still exceed the processing capacity of the machine, which poses significant challenges in terms of storage, memory availability, and the computational power required for learning. Our proposed approach addresses these challenges by optimizing the content of the dataset while preserving the information essential for classification. Identical or highly similar images are identified, grouped together, and represented by the most representative image of each group. At the same time, the size of the images can be reduced. Another significant challenge addressed by our approach is managing class imbalance within the dataset. The approach has been evaluated and the results are promising.
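The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes cosine similarity on raw pixel values, a greedy grouping pass within each class, and the medoid (the member most similar to the rest of its group) as the "most representative" image. The function name `reduce_dataset` and the `threshold` parameter are hypothetical.

```python
import numpy as np

def reduce_dataset(images, labels, threshold=0.95):
    """Return indices of one representative image per near-duplicate group.

    Assumptions (not taken from the paper): similarity is cosine similarity
    on flattened pixels, and grouping is greedy within each class.
    """
    images = np.asarray(images, dtype=np.float64)
    labels = np.asarray(labels)
    flat = images.reshape(len(images), -1)

    # Unit-normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)

    kept = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)   # images of this class only
        sim = unit[idx] @ unit[idx].T         # pairwise similarities
        unassigned = np.ones(len(idx), dtype=bool)
        for i in range(len(idx)):
            if not unassigned[i]:
                continue
            # Group image i with every unassigned image similar enough to it.
            mask = unassigned & (sim[i] >= threshold)
            mask[i] = True
            members = np.flatnonzero(mask)
            unassigned[members] = False
            # Medoid: the member with the highest total similarity to its group.
            sub = sim[np.ix_(members, members)]
            kept.append(idx[members[np.argmax(sub.sum(axis=1))]])
    return np.sort(np.array(kept))
```

Grouping within each class keeps labels unambiguous, and the returned indices can be used to subset both the images and the labels. Comparing the per-class counts before and after reduction gives a quick view of how the step affects class balance.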
Keywords
Details
Primary Language: English
Subjects: Software Engineering (Other)
Journal Section: Conference Paper
Early Pub Date: December 25, 2023
Publication Date: December 30, 2023
Submission Date: June 13, 2023
Acceptance Date: November 21, 2023
Published in Issue: Year 2023, Volume 26