The success of a convolutional neural network (CNN) depends heavily on the quantity of labeled samples available for training. However, creating a labeled dataset is a difficult and time-consuming process. In contrast, unlabeled data is cheap and easy to access. Semi-supervised methods incorporate unlabeled data into the training process, which allows the model to learn from unlabeled data as well. We propose a semi-supervised method based on the ensemble approach and the pseudo-labeling method. Because the unlabeled dataset is balanced with the labeled dataset during training, our proposed training strategy yields both high decision diversity among the base-learner models and high individual accuracy for each base learner. We show that using multiple CNN models can result in both higher accuracy and a more robust model than training a single CNN model. For inference, we propose using both stacking and voting methodologies. We show that the most successful algorithm for the stacking approach is the Support Vector Machine (SVM). In experiments, we use the STL-10 dataset to evaluate the models, and we increase accuracy by 15.9% over training using only labeled data. Since the proposed training method is based on cross-entropy loss, it can be combined with state-of-the-art algorithms.
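To make the voting and pseudo-labeling ideas concrete, the following is a minimal sketch, not the authors' implementation. It assumes each base learner outputs a class-probability vector, combines them by soft voting (simple averaging), and accepts a pseudo-label only when the averaged confidence reaches a threshold. The function names `soft_vote` and `pseudo_label`, the threshold value of 0.8, and the example probabilities are all hypothetical and do not come from the paper.

```python
from typing import List, Optional, Sequence


def soft_vote(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Average the class-probability vectors of several base learners."""
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    return [sum(p[c] for p in probabilities) / n_models for c in range(n_classes)]


def pseudo_label(
    probabilities: Sequence[Sequence[float]], threshold: float = 0.8
) -> Optional[int]:
    """Return the ensemble's class index, or None if confidence is too low.

    The threshold is an assumed hyperparameter, not a value from the paper.
    """
    averaged = soft_vote(probabilities)
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return best if averaged[best] >= threshold else None


# Hypothetical outputs of three base learners on a 3-class problem.
confident = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.85, 0.1, 0.05]]
uncertain = [[0.5, 0.4, 0.1], [0.3, 0.6, 0.1], [0.4, 0.3, 0.3]]

print(pseudo_label(confident))  # the learners agree strongly on class 0
print(pseudo_label(uncertain))  # the learners disagree, so no label is assigned
```

In a stacking setup, the per-model probability vectors would instead be concatenated and passed to a meta-classifier such as an SVM, rather than averaged.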
Ensemble Learning, Pseudo Labeling, Semi-Supervised Learning, STL-10
Primary Language | English |
---|---|
Subjects | Biochemistry and Cell Biology (Other) |
Section | Research Articles |
Authors | |
Publication Date | 12 June 2024 |
Submission Date | 19 July 2022 |
Published in Issue | Year 2024 Volume: 42 Issue: 3 |