Research Article

BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION

Year 2025, Volume: 9 Issue: 2, 320 - 330, 30.08.2025
https://doi.org/10.46519/ij3dptdi.1703106

Abstract

Class imbalance presents a persistent challenge in supervised learning, often degrading classifier performance on underrepresented classes. This study introduces BODE, a hybrid oversampling method that combines boundary-aware instance selection, differential evolution-based perturbation, and density-constrained filtering. By targeting critical minority instances near decision boundaries, BODE generates diverse yet structurally valid synthetic samples. Experiments on 44 benchmark datasets using k-NN, Decision Tree, and SVM classifiers demonstrate that BODE consistently outperforms eleven widely used oversampling methods. Evaluated solely using the AUC metric, BODE achieves the highest average performance across all classifiers, with 28, 33, and 26 dataset-level wins, respectively. These results confirm BODE’s robustness and generalization capability, particularly in challenging scenarios involving overlapping or sparse decision regions.
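
The three stages named in the abstract (boundary-aware selection, differential evolution-based perturbation, density-constrained filtering) can be illustrated with a rough sketch. This is not the author's implementation: the abstract gives no formulas, so the k-NN boundary rule, the DE/rand/1-style mutation with scale factor `F`, the mean k-NN distance used as a density test, and all function and parameter names below are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def pairwise_dist(A, B):
    """Euclidean distances between every row of A and every row of B."""
    diff = A[:, None, :] - B[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

def boundary_minority(X, y, minority=1, k=5):
    """Indices of minority points whose k nearest neighbours are mixed:
    at least one majority neighbour, but not only majority neighbours.
    (Assumed boundary rule, not taken from the paper.)"""
    D = pairwise_dist(X, X)
    np.fill_diagonal(D, np.inf)              # a point is not its own neighbour
    nn = np.argsort(D, axis=1)[:, :k]
    maj_frac = (y[nn] != minority).mean(axis=1)
    mask = (y == minority) & (maj_frac > 0) & (maj_frac < 1)
    return np.flatnonzero(mask)

def oversample_sketch(X, y, n_new, minority=1, k=5, F=0.5, seed=0,
                      max_tries=10000):
    """Generate up to n_new synthetic minority samples.
    Requires at least two minority points."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]

    # Stage 1: pick seeds near the class boundary (fall back to all minority).
    seeds = boundary_minority(X, y, minority, k)
    if len(seeds) == 0:
        seeds = np.flatnonzero(y == minority)

    # Density threshold (assumed): the largest mean distance from any real
    # minority point to its k nearest minority neighbours.
    k_min = min(k, len(X_min) - 1)
    D_mm = pairwise_dist(X_min, X_min)
    np.fill_diagonal(D_mm, np.inf)
    radius = np.sort(D_mm, axis=1)[:, :k_min].mean(axis=1).max()

    accepted = []
    for _ in range(max_tries):
        if len(accepted) == n_new:
            break
        # Stage 2: DE-style mutation, base + F * (r1 - r2).
        base = X[rng.choice(seeds)]
        r1, r2 = X_min[rng.choice(len(X_min), size=2, replace=False)]
        candidate = base + F * (r1 - r2)

        # Stage 3: keep the candidate only if it lies in a region at least
        # as dense in minority points as the sparsest real minority point.
        d = np.sort(pairwise_dist(candidate[None, :], X_min)[0])[:k_min].mean()
        if d <= radius:
            accepted.append(candidate)
    return np.array(accepted).reshape(-1, X.shape[1])
```

Calling `oversample_sketch(X, y, n_new=10)` on a feature matrix `X` and label vector `y` returns at most ten synthetic rows. Seeding from boundary points concentrates new samples where classifiers tend to err, and the density check is one simple way to discard candidates that drift into empty space.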



Details

Primary Language English
Subjects Software Engineering (Other)
Section Research Article
Authors

Muhammed Abdulhamid Karabıyık 0000-0001-7927-8790

Publication Date August 30, 2025
Submission Date May 20, 2025
Acceptance Date July 27, 2025
Published in Issue Year 2025 Volume: 9 Issue: 2

Cite

APA Karabıyık, M. A. (2025). BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION. International Journal of 3D Printing Technologies and Digital Industry, 9(2), 320-330. https://doi.org/10.46519/ij3dptdi.1703106
AMA Karabıyık MA. BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION. IJ3DPTDI. August 2025;9(2):320-330. doi:10.46519/ij3dptdi.1703106
Chicago Karabıyık, Muhammed Abdulhamid. “BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION”. International Journal of 3D Printing Technologies and Digital Industry 9, no. 2 (August 2025): 320-30. https://doi.org/10.46519/ij3dptdi.1703106.
EndNote Karabıyık MA (01 August 2025) BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION. International Journal of 3D Printing Technologies and Digital Industry 9 2 320–330.
IEEE M. A. Karabıyık, “BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION”, IJ3DPTDI, vol. 9, no. 2, pp. 320–330, 2025, doi: 10.46519/ij3dptdi.1703106.
ISNAD Karabıyık, Muhammed Abdulhamid. “BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION”. International Journal of 3D Printing Technologies and Digital Industry 9/2 (August 2025), 320-330. https://doi.org/10.46519/ij3dptdi.1703106.
JAMA Karabıyık MA. BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION. IJ3DPTDI. 2025;9:320–330.
MLA Karabıyık, Muhammed Abdulhamid. “BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION”. International Journal of 3D Printing Technologies and Digital Industry, vol. 9, no. 2, 2025, pp. 320-30, doi:10.46519/ij3dptdi.1703106.
Vancouver Karabıyık MA. BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION. IJ3DPTDI. 2025;9(2):320-30.


The International Journal of 3D Printing Technologies and Digital Industry is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.