Research Article

Optimization Based Undersampling for Imbalanced Classes

Volume: 11 Issue: 2, 31 December 2021

Abstract

When classes contain unequal numbers of observations, classification methods tend to be biased toward predicting the majority class. Resampling methods are one approach proposed in the literature to address this problem. Undersampling, one such method, balances the classes by removing observations from the majority class. This study compares different optimization methods for selecting the most suitable majority-class observations during undersampling. First, a simple simulation study was conducted, and graphs were used to analyze the differences between the resampled datasets. Then, different classifier models were built on several imbalanced datasets. In these models, random undersampling was compared with undersampling guided by a genetic algorithm, a differential evolution algorithm, an artificial bee colony, and particle swarm optimization. The results were ranked for each classifier and dataset, and an overall mean rank was computed. Overall, the artificial bee colony performed better than the other optimization methods for undersampling.
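As a rough illustration of the baseline named in the abstract, the following Python sketch shows random undersampling: majority-class observations are dropped at random until every class matches the size of the smallest class. The function name `random_undersample`, its parameters, and the toy data are assumptions made for this example and do not come from the article. The optimization-based variants would replace the random draw with a search guided by a fitness function, which the abstract does not specify, so it is not sketched here.

```python
import random


def random_undersample(X, y, seed=0):
    """Hypothetical helper: balance classes by randomly dropping rows
    from every class that is larger than the smallest class."""
    rng = random.Random(seed)

    # Group row indices by class label.
    by_class = {}
    for idx, label in enumerate(y):
        by_class.setdefault(label, []).append(idx)

    # Every class is reduced to the size of the smallest class.
    target = min(len(idxs) for idxs in by_class.values())

    keep = []
    for idxs in by_class.values():
        if len(idxs) == target:
            keep.extend(idxs)                      # smallest class: keep all rows
        else:
            keep.extend(rng.sample(idxs, target))  # larger class: random subset

    keep.sort()  # preserve the original row order
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data (made up for illustration): 8 majority rows, 2 minority rows.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
X_bal, y_bal = random_undersample(X, y, seed=1)
```

After the call, `y_bal` holds two labels of each class, and both minority rows are retained unchanged.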

Details

Primary Language

English

Subjects

Applied Mathematics

Section

Research Article

Publication Date

31 December 2021

Submission Date

21 February 2021

Acceptance Date

25 October 2021

Published in Issue

Year 2021 Volume: 11 Issue: 2

Cite

APA
Sağlam, F., Sözen, M., & Cengiz, M. A. (2021). Optimization Based Undersampling for Imbalanced Classes. Adıyaman University Journal of Science, 11(2), 385-409. https://doi.org/10.37094/adyujsci.884120
AMA
1. Sağlam F, Sözen M, Cengiz MA. Optimization Based Undersampling for Imbalanced Classes. ADYU J SCI. 2021;11(2):385-409. doi:10.37094/adyujsci.884120
Chicago
Sağlam, Fatih, Mervenur Sözen, and Mehmet Ali Cengiz. 2021. “Optimization Based Undersampling for Imbalanced Classes”. Adıyaman University Journal of Science 11 (2): 385-409. https://doi.org/10.37094/adyujsci.884120.
EndNote
Sağlam F, Sözen M, Cengiz MA (01 December 2021) Optimization Based Undersampling for Imbalanced Classes. Adıyaman University Journal of Science 11 2 385–409.
IEEE
[1] F. Sağlam, M. Sözen, and M. A. Cengiz, “Optimization Based Undersampling for Imbalanced Classes”, ADYU J SCI, vol. 11, no. 2, pp. 385–409, Dec. 2021, doi: 10.37094/adyujsci.884120.
ISNAD
Sağlam, Fatih - Sözen, Mervenur - Cengiz, Mehmet Ali. “Optimization Based Undersampling for Imbalanced Classes”. Adıyaman University Journal of Science 11/2 (01 December 2021): 385-409. https://doi.org/10.37094/adyujsci.884120.
JAMA
1. Sağlam F, Sözen M, Cengiz MA. Optimization Based Undersampling for Imbalanced Classes. ADYU J SCI. 2021;11:385–409.
MLA
Sağlam, Fatih, et al. “Optimization Based Undersampling for Imbalanced Classes”. Adıyaman University Journal of Science, vol. 11, no. 2, December 2021, pp. 385-409, doi:10.37094/adyujsci.884120.
Vancouver
1. Fatih Sağlam, Mervenur Sözen, Mehmet Ali Cengiz. Optimization Based Undersampling for Imbalanced Classes. ADYU J SCI. 01 December 2021;11(2):385-409. doi:10.37094/adyujsci.884120