Research Article
Year 2025, Volume: 9 Issue: 2, 90 - 107, 31.12.2025
https://doi.org/10.52148/ehta.1768556

A Balanced Machine Learning Approach to Obesity Risk Classification: Comparative Analysis and Feature Importance

Abstract

Obesity is a growing public health concern, particularly among university students, who are exposed to lifestyle changes, disordered eating habits, and reduced physical activity. This study aims to classify obesity risk levels among university students using machine learning classification methods and to identify the factors most strongly associated with that risk. The sample comprised 445 students at Çankırı Karatekin University. Eight machine learning algorithms were compared: Logistic Regression, Random Forest, Extra Trees, Support Vector Machines, K-Nearest Neighbor, Quadratic Discriminant Analysis, Naive Bayes, and Multilayer Perceptron. Class imbalance in the dataset was addressed with the Adaptive Synthetic Sampling (ADASYN) method, applied exclusively to the training set. The models were evaluated using standard performance metrics. The Random Forest model achieved the highest accuracy (96.26%), followed by Logistic Regression (94.77%). Variable importance analysis indicated that age, internet use scale score, and fast-food consumption frequency were the most influential factors in classification, while the low correlation between variables (|r| < 0.2) suggested that model performance was driven by the combined contribution of multiple features. Overall, the findings demonstrate that a balanced machine learning approach, particularly with ensemble-based methods, can classify obesity risk with high accuracy and provide valuable insights for targeted prevention strategies among university students.
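
The abstract states that ADASYN was applied only to the training set. The sketch below illustrates that ordering (split first, then oversample the training rows) with a simplified, standard-library version of the ADASYN idea from He et al. (2008). It is not the authors' implementation: the function name `adasyn_sketch`, its parameters, the toy data, and the choice to interpolate toward the nearest minority-class neighbours are all assumptions made for illustration. In practice a maintained implementation such as the `ADASYN` class in imbalanced-learn would normally be used instead.

```python
import math
import random


def adasyn_sketch(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Generate synthetic minority rows in the spirit of ADASYN (He et al., 2008)."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(minority_idx)
    n_maj = len(y) - n_min
    n_new = int((n_maj - n_min) * beta)  # total synthetic rows wanted
    if n_new <= 0 or n_min < 2:
        return []

    def nearest(i, pool):
        # Up to k indices from `pool` closest to row i (Euclidean), excluding i.
        others = sorted((j for j in pool if j != i),
                        key=lambda j: math.dist(X[i], X[j]))
        return others[:k]

    # Difficulty ratio: share of majority rows among each minority row's neighbours.
    ratios = []
    for i in minority_idx:
        nbrs = nearest(i, range(len(y)))
        ratios.append(sum(y[j] != minority for j in nbrs) / len(nbrs))
    total = sum(ratios)
    if total == 0:
        return []  # no minority row has a majority neighbour; nothing to weight

    synthetic = []
    for i, ratio in zip(minority_idx, ratios):
        n_i = round(ratio / total * n_new)  # harder rows receive more synthetic rows
        mates = nearest(i, minority_idx)    # assumption: interpolate toward minority neighbours
        for _ in range(n_i):
            j = rng.choice(mates)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic


# Toy data (hypothetical, not the study's survey data): 80 majority, 20 minority rows.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(80)]
X += [[data_rng.gauss(1.5, 1), data_rng.gauss(1.5, 1)] for _ in range(20)]
y = [0] * 80 + [1] * 20

# Split first, so the held-out rows are never touched by the oversampler.
order = list(range(len(y)))
data_rng.shuffle(order)
cut = int(0.8 * len(order))
train_idx, test_idx = order[:cut], order[cut:]
X_train, y_train = [X[i] for i in train_idx], [y[i] for i in train_idx]
X_test, y_test = [X[i] for i in test_idx], [y[i] for i in test_idx]

# Oversample the training split only; the test split keeps its original class mix.
new_rows = adasyn_sketch(X_train, y_train)
X_train_bal = X_train + new_rows
y_train_bal = y_train + [1] * len(new_rows)
```

Keeping the synthetic rows out of the test split means the reported accuracy is measured on real observations only, so it is not inflated by interpolated points that resemble the training data.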

Ethics Statement

Ethical approval for the study was obtained from the T.C. Çankırı Karatekin University Science, Mathematics and Social Sciences Ethics Committee (Approval No: 44, Date: 23-08-2024). There are no conflicts of interest to declare.

Details

Primary Language: English
Subjects: Health and Ecological Risk Assessment, Digital Health
Section: Research Article
Authors

Haydar Koç 0000-0002-8568-4717

Tuba Koc 0000-0001-5204-0846

Submission Date: August 19, 2025
Acceptance Date: November 17, 2025
Publication Date: December 31, 2025
Published in Issue: Year 2025, Volume: 9, Issue: 2

Cite

APA Koç, H., & Koc, T. (2025). A Balanced Machine Learning Approach to Obesity Risk Classification: Comparative Analysis and Feature Importance. Eurasian Journal of Health Technology Assessment, 9(2), 90-107. https://doi.org/10.52148/ehta.1768556

The journal is open access and double-blind peer reviewed.

Journal content is available to all users free of charge.
Scientific responsibility for the articles published in the journal rests with their authors.
Articles published in the journal may not be used without citation.
© Republic of Türkiye Ministry of Health, General Directorate of Health Services, Department of Research, Development and Health Technology Assessment
All rights belong to the Republic of Türkiye Ministry of Health, General Directorate of Health Services.