TY - JOUR
T1 - Performance Comparison of Machine Learning Algorithms Using Oversampling Methods to Predict Childhood Anemia
TT - Çocukluk Çağı Anemisinin Tahmininde Aşırı Örnekleme Yöntemlerini Kullanan Makine Öğrenmesi Algoritmalarının Performans Karşılaştırması
AU - Balbal, Kadriye Filiz
PY - 2025
DA - November
Y2 - 2024
DO - 10.7212/karaelmasfen.1555212
JF - Karaelmas Fen ve Mühendislik Dergisi
PB - Zonguldak Bülent Ecevit Üniversitesi
WT - DergiPark
SN - 2146-7277
SP - 1
EP - 11
VL - 15
IS - 3
LA - en
AB - Childhood anemia is a major health problem. Anemia is common in preschool-aged children and causes delays in their physical and mental development. Because it is preventable and treatable, it is important to investigate it with state-of-the-art methods such as artificial intelligence. This study therefore uses machine learning, a subfield of artificial intelligence, to predict anemia levels in children aged 0–59 months in Nigeria. To address data imbalance, which can cause problems in estimating childhood anemia levels, the SMOTE and ADASYN oversampling techniques were applied. In analyses on the resulting balanced data, all ML models performed significantly better with SMOTE and ADASYN than with the imbalanced data. When accuracy, precision, recall, and F1 score were averaged over all ML algorithms used in the study, SMOTE gave the best result on every metric.
KW - Artificial intelligence
KW - machine learning
KW - ADASYN
KW - SMOTE
KW - childhood anemia
UR - https://doi.org/10.7212/karaelmasfen.1555212
L1 - https://dergipark.org.tr/tr/download/article-file/4236895
ER -