Research Article

BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION

Year 2025, Volume: 9 Issue: 2, 320 - 330, 30.08.2025
https://doi.org/10.46519/ij3dptdi.1703106

Abstract

Class imbalance presents a persistent challenge in supervised learning, often degrading classifier performance on underrepresented classes. This study introduces BODE, a hybrid oversampling method that combines boundary-aware instance selection, differential evolution-based perturbation, and density-constrained filtering. By targeting critical minority instances near decision boundaries, BODE generates diverse yet structurally valid synthetic samples. Experiments on 44 benchmark datasets using k-NN, Decision Tree, and SVM classifiers demonstrate that BODE consistently outperforms eleven widely used oversampling methods. Evaluated solely using the AUC metric, BODE achieves the highest average performance across all classifiers, with 28, 33, and 26 dataset-level wins, respectively. These results confirm BODE’s robustness and generalization capability, particularly in challenging scenarios involving overlapping or sparse decision regions.
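
The three stages named in the abstract (boundary-aware selection, differential evolution-based perturbation, density-constrained filtering) can be illustrated with a minimal NumPy sketch. This is not the published BODE algorithm: the borderline rule (a minority point with a mixed k-neighbourhood), the DE/rand/1 mutation with binomial crossover, the density threshold, and every function and parameter name below are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def bode_like_oversample(X, y, minority=1, n_new=20, k=5,
                         F=0.5, CR=0.9, seed=0, max_tries=10000):
    """Illustrative sketch only; all rules and names here are assumptions."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    n_min, n_feat = X_min.shape  # needs at least 4 minority points

    # 1) Boundary-aware selection (assumed rule): keep minority points whose
    #    k nearest neighbours contain both classes. Column 0 of the sorted
    #    neighbours is the point itself (assuming no exact duplicates).
    d_all = np.linalg.norm(X_min[:, None, :] - X[None, :, :], axis=2)
    nn = np.argsort(d_all, axis=1)[:, 1:k + 1]
    maj_frac = (y[nn] != minority).mean(axis=1)
    boundary = np.flatnonzero((maj_frac > 0) & (maj_frac < 1))
    if boundary.size == 0:           # fall back to all minority points
        boundary = np.arange(n_min)

    # 3) Density reference (assumed rule): the sparsest real minority point,
    #    measured as mean distance to its k_d nearest minority neighbours.
    k_d = min(k, n_min - 1)
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    d_min.sort(axis=1)
    limit = d_min[:, 1:k_d + 1].mean(axis=1).max()

    synth = []
    for _ in range(max_tries):
        if len(synth) == n_new:
            break
        i = rng.choice(boundary)     # target: a boundary minority point
        # 2) DE/rand/1 mutation over minority points, binomial crossover.
        r1, r2, r3 = rng.choice(n_min, size=3, replace=False)
        mutant = X_min[r1] + F * (X_min[r2] - X_min[r3])
        mask = rng.random(n_feat) < CR
        mask[rng.integers(n_feat)] = True   # at least one mutant feature
        trial = np.where(mask, mutant, X_min[i])
        # 3) Density-constrained filter: reject candidates that lie in a
        #    region sparser than any real minority point.
        d_trial = np.sort(np.linalg.norm(X_min - trial, axis=1))[:k_d].mean()
        if d_trial <= limit:
            synth.append(trial)
    return np.array(synth).reshape(-1, n_feat)

# Toy usage on synthetic two-class data (not one of the paper's datasets).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(2.0, 0.5, (15, 2))])
y = np.array([0] * 100 + [1] * 15)
S = bode_like_oversample(X, y)
print(S.shape)
```

The filter in this sketch compares each candidate against the sparsest real minority point, so it only rejects samples that drift outside the observed minority region; a kernel density estimate would be a natural alternative, but whether the paper uses one is not stated in the abstract.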

References

  • 1. Leevy, J.L., Khoshgoftaar, T.M., Bauder, R.A. and Seliya, N., “A survey on addressing high-class imbalance in big data”, Journal of Big Data, Vol. 5, Pages 42, 2018.
  • 2. Kononenko, I., “Machine learning for medical diagnosis: history, state of the art and perspective”, Artificial Intelligence in Medicine, Vol. 23, Pages 89–109, 2001.
  • 3. Huang, S.Y., Lin, C.-C., Chiu, A.-A. and Yen, D.C., “Fraud detection using fraud triangle risk factors”, Information Systems Frontiers, Vol. 19, Pages 1343–1356, 2017.
  • 4. Jiang, Y., Cukic, B. and Ma, Y., “Techniques for evaluating fault prediction models”, Empirical Software Engineering, Vol. 13, Pages 561–595, 2008.
  • 5. Kovács, G., “SMOTE-variants: A Python implementation of 85 minority oversampling techniques”, Neurocomputing, Vol. 366, Pages 352–354, 2019.
  • 6. Korkmaz, S., Şahman, M.A., Cinar, A.C. and Kaya, E., “Boosting the oversampling methods based on differential evolution strategies for imbalanced learning”, Applied Soft Computing, Vol. 112, Pages 107787, 2021.
  • 7. Korkmaz, S., “Hybridization of DEBOHID with ENN algorithm for highly imbalanced datasets”, Engineering Science and Technology, an International Journal, Vol. 63, Pages 101976, 2025.
  • 8. Wang, X., Li, Y., Zhang, J., Zhang, B. and Gong, H., “An oversampling method based on differential evolution and natural neighbors”, Applied Soft Computing, Vol. 149, Pages 110952, 2023.
  • 9. Kaya, E., Korkmaz, S., Sahman, M.A. and Cinar, A.C., “DEBOHID: A differential evolution based oversampling approach for highly imbalanced datasets”, Expert Systems with Applications, Vol. 169, Pages 114482, 2021.
  • 10. Bunkhumpornpat, C., Sinapiromsaran, K. and Lursinsap, C., “Safe-Level-SMOTE: Safe-Level-Synthetic Minority Over-Sampling TEchnique for handling the class imbalanced problem”, Proceedings, Pages 475–482, 2009.
  • 11. He, H., Bai, Y., Garcia, E.A. and Li, S., “ADASYN: Adaptive synthetic sampling approach for imbalanced learning”, Proceedings of the 2008 IEEE International Joint Conference on Neural Networks, Pages 1322–1328, 2008.
  • 12. Chawla, N.V., Bowyer, K.W., Hall, L.O. and Kegelmeyer, W.P., “SMOTE: Synthetic Minority Over-sampling Technique”, Journal of Artificial Intelligence Research, Vol. 16, Pages 321–357, 2002.
  • 13. Hairani, H., Anggrawan, A. and Priyanto, D., “Improvement performance of the Random Forest method on unbalanced diabetes data classification using SMOTE-Tomek Link”, International Journal on Informatics Visualization, Vol. 7, Pages 258, 2023.
  • 14. Nishat, M.M., Faisal, F., Ratul, I.J., Al-Monsur, A., Ar-Rafi, A.M., Nasrullah, S.M., Reza, M.T. and Khan, M.R.H., “A comprehensive investigation of the performances of different machine learning classifiers with SMOTE-ENN oversampling technique and hyperparameter optimization for imbalanced heart failure dataset”, Scientific Programming, Vol. 2022, Pages 1–17, 2022.
  • 15. Shen, Y., Wu, J., Ma, M., Du, X., Wu, H., Fei, X. and Niu, D., “Improved differential evolution algorithm based on cooperative multi-population”, Engineering Applications of Artificial Intelligence, Vol. 133, Pages 108149, 2024.
  • 16. Deng, W., Shang, S., Cai, X., Zhao, H., Song, Y. and Xu, J., “An improved differential evolution algorithm and its application in optimization problem”, Soft Computing, Vol. 25, Pages 5277–5298, 2021.
  • 17. Salgotra, R. and Gandomi, A.H., “A novel multi-hybrid differential evolution algorithm for optimization of frame structures”, Scientific Reports, Vol. 14, Pages 4877, 2024.
  • 18. Chen, Y.-C., “A tutorial on kernel density estimation and recent advances”, Biostatistics & Epidemiology, Vol. 1, Pages 161–187, 2017.
  • 19. Alcalá-Fdez, J., Sánchez, L., García, S., del Jesus, M.J., Ventura, S., Garrell, J.M., Otero, J., Romero, C., Bacardit, J., Rivas, V.M., Fernández, J.C. and Herrera, F., “KEEL: A software tool to assess evolutionary algorithms for data mining problems”, Soft Computing, Vol. 13, Pages 307–318, 2009.
  • 20. Chandra, M.A. and Bedi, S.S., “Survey on SVM and their application in image classification”, International Journal of Information Technology, Vol. 13, Pages 1–11, 2021.
  • 21. Myles, A.J., Feudale, R.N., Liu, Y., Woody, N.A. and Brown, S.D., “An introduction to decision tree modeling”, Journal of Chemometrics, Vol. 18, Pages 275–285, 2004.
  • 22. Guo, G., Wang, H., Bell, D., Bi, Y. and Greer, K., “KNN model-based approach in classification”, Proceedings, Pages 986–996, 2003.
  • 23. Huang, J. and Ling, C.X., “Using AUC and accuracy in evaluating learning algorithms”, IEEE Transactions on Knowledge and Data Engineering, Vol. 17, Pages 299–310, 2005.

There are 22 citations in total.

Details

Primary Language English
Subjects Software Engineering (Other)
Journal Section Research Article
Authors

Muhammed Abdulhamid Karabıyık 0000-0001-7927-8790

Publication Date August 30, 2025
Submission Date May 20, 2025
Acceptance Date July 27, 2025
Published in Issue Year 2025 Volume: 9 Issue: 2

Cite

APA Karabıyık, M. A. (2025). BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION. International Journal of 3D Printing Technologies and Digital Industry, 9(2), 320-330. https://doi.org/10.46519/ij3dptdi.1703106
AMA Karabıyık MA. BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION. International Journal of 3D Printing Technologies and Digital Industry. August 2025;9(2):320-330. doi:10.46519/ij3dptdi.1703106
Chicago Karabıyık, Muhammed Abdulhamid. “BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION”. International Journal of 3D Printing Technologies and Digital Industry 9, no. 2 (August 2025): 320-30. https://doi.org/10.46519/ij3dptdi.1703106.
EndNote Karabıyık MA (August 1, 2025) BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION. International Journal of 3D Printing Technologies and Digital Industry 9 2 320–330.
IEEE M. A. Karabıyık, “BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION”, International Journal of 3D Printing Technologies and Digital Industry, vol. 9, no. 2, pp. 320–330, 2025, doi: 10.46519/ij3dptdi.1703106.
ISNAD Karabıyık, Muhammed Abdulhamid. “BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION”. International Journal of 3D Printing Technologies and Digital Industry 9/2 (August 2025), 320-330. https://doi.org/10.46519/ij3dptdi.1703106.
JAMA Karabıyık MA. BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION. International Journal of 3D Printing Technologies and Digital Industry. 2025;9:320–330.
MLA Karabıyık, Muhammed Abdulhamid. “BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION”. International Journal of 3D Printing Technologies and Digital Industry, vol. 9, no. 2, 2025, pp. 320-30, doi:10.46519/ij3dptdi.1703106.
Vancouver Karabıyık MA. BODE: A BOUNDARY AWARE DIFFERENTIAL EVOLUTION BASED HYBRID OVERSAMPLING TECHNIQUE FOR IMBALANCED DATA CLASSIFICATION. International Journal of 3D Printing Technologies and Digital Industry. 2025;9(2):320-30.

International Journal of 3D Printing Technologies and Digital Industry is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.