Research Article
K En Yakın Komşu Makine Öğrenme Algoritmasına Dayalı Diabetes Mellitus Tahmini

Year 2024, Volume: 8 Issue: 3, 265 - 276, 30.12.2024

Diabetes Mellitus Prediction Based on K Nearest Neighbor Machine Learning Algorithm

Abstract

Aim: This study aims to predict diabetes mellitus, a growing worldwide public health problem, using a machine learning method.
Material and Methods: The study used 253,680 records of diabetes mellitus health indicators obtained from the Kaggle database. The K nearest neighbor method was used to predict each individual's diabetes mellitus status. All analyses were performed in R.
Results: Approximately 15.8% of the individuals had prediabetes or diabetes mellitus, 42.9% had high blood pressure, and 42.4% had high cholesterol. Smokers made up 44.3% of the sample and heavy alcohol consumers 5.6%. A history of heart disease or heart attack was reported by 9.4%, difficulty walking by 16.8%, and no physical activity by 24.4%. Mean BMI was 27.74±6.26 in individuals without diabetes mellitus and 31.94±7.36 in those with diabetes mellitus. With the K nearest neighbor method, the best prediction was obtained with a 90.0%-10.0% training-test split and K = 3 (three) neighbors. Using the relevant indicators, diabetes mellitus status was predicted with 97.2% accuracy and a kappa of 88.9%.
Conclusion: The literature reports that machine learning methods have become widely used in many fields in recent years and have yielded successful results. This study demonstrates in practice that diabetes mellitus can be predicted with a machine learning approach at a high success rate. Because diabetes mellitus progresses silently and the number of cases is increasing, early diagnosis is vitally important. Given its ease of application and high classification performance, the K nearest neighbor method is recommended for use by healthcare providers in the early diagnosis and treatment of diabetes mellitus.
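
The workflow summarized in the abstract (a 90%-10% training-test split, K = 3 neighbors, and evaluation by accuracy and Cohen's kappa) can be illustrated with a minimal Python sketch. The study itself was carried out in R, so this is not the authors' code: the function names, the Euclidean distance, the random seed, and the synthetic two-cluster data are all assumptions made for illustration, and the numbers it produces say nothing about the study's results. Kappa is computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance.

```python
import math
import random
from collections import Counter


def train_test_split(X, y, test_fraction=0.10, seed=42):
    """Shuffle row indices and hold out `test_fraction` of the rows for testing."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_test = max(1, round(len(X) * test_fraction))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], [y[i] for i in test_idx])


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k nearest training points (Euclidean)."""
    nearest = sorted(range(len(train_X)),
                     key=lambda i: math.dist(train_X[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement: sum over classes of P(true = c) * P(pred = c).
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case: a single class in both vectors, so agreement is total.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


def run_demo():
    """Synthetic, well-separated toy data (NOT the study's dataset)."""
    grid = [(i * 0.1, j * 0.1) for i in range(10) for j in range(10)]
    X = grid + [(5 + a, 5 + b) for a, b in grid]
    y = [0] * len(grid) + [1] * len(grid)
    train_X, train_y, test_X, test_y = train_test_split(X, y, test_fraction=0.10)
    preds = [knn_predict(train_X, train_y, q, k=3) for q in test_X]
    return accuracy(test_y, preds), cohen_kappa(test_y, preds)


if __name__ == "__main__":
    acc, kappa = run_demo()
    print(f"accuracy={acc:.3f} kappa={kappa:.3f}")
```

Because K nearest neighbor relies on distances, features measured on different scales are normally standardized before fitting; the toy data above skips that step only because both features share the same scale.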

References

  • 1. Kır Biçer E, Çekiç M, Ayvazoğlu G. Üniversite Çalışanlarında Tip 2 Diyabet Riskinin ve İlişkili Faktörlerin Değerlendirilmesi. IGUSABDER. 2024;253–272.
  • 2. Oliullah K, Rasel MH, Islam MM, et al. A stacked ensemble machine learning approach for the prediction of diabetes. J Diabetes Metab Disord 23, 603–617 (2024). https://doi.org/10.1007/s40200-023-01321-2
  • 3. Elsayed N, ElSayed Z and Ozer M. “Early Stage Diabetes Prediction via Extreme Learning Machine,” SoutheastCon 2022, Mobile, AL, USA, 2022, pp. 374-379, doi: 10.1109/SoutheastCon48659.2022.9764032.
  • 4. Dritsas E, Trigka M. Data-Driven Machine-Learning Methods for Diabetes Risk Prediction. Sensors. 2022; 22(14):5304. https://doi.org/10.3390/s22145304
  • 5. Al-Haija QA, Smadi M, Al-Bataineh OM. Early Stage Diabetes Risk Prediction via Machine Learning. In: Abraham, A., et al. Proceedings of the 13th International Conference on Soft Computing and Pattern Recognition (SoCPaR 2021) (2022). Lecture Notes in Networks and Systems, vol 417. Springer, Cham. https://doi.org/10.1007/978-3-030-96302-6_42.
  • 6. International Diabetes Federation. IDF Diabetes Atlas: 10th edition 2021. https://diabetesatlas.org/data/en/country/203/tr.html Accessed: 07.07.2024.
  • 7. Bishop CM. Pattern Recognition and Machine Learning. Springer, ISBN: 0-387-31073-8 (2007).
  • 8. Alpaydin E. Introduction to Machine Learning. London: The MIT Press (2010).
  • 9. Khakurel U, Abdelmoumin G, Bajracharya A, Rawat DB. “Exploring bias and fairness in artificial intelligence and machine learning algorithms”, Proc. SPIE 12113, Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications IV, 1211324 (6 June 2022), https://doi.org/10.1117/12.2621282.
  • 10. Islam MS, Qaraqe MK, Abbas HT, Erraguntla M and Abdul-Ghani M. “The Prediction of Diabetes Development: A Machine Learning Framework,” 2020 IEEE 5th Middle East and Africa Conference on Biomedical Engineering (MECBME), Amman, Jordan, 2020, pp. 1-6, doi: 10.1109/MECBME47393.2020.9292043.
  • 11. Deberneh HM, Kim I. Prediction of Type 2 Diabetes Based on Machine Learning Algorithm. International Journal of Environmental Research and Public Health. 2021; 18(6):3317. https://doi.org/10.3390/ijerph18063317
  • 12. Singh Y, Tiwari M. Revolutionizing Diabetes Disease Prediction Through Novel Machine Learning Techniques. Nano. 2024;19(4). https://doi.org/10.1142/S179329202350056X
  • 13. Islam MS, Minul Alam M, Ahamed A and Ali Meerza SI. “Prediction of Diabetes at Early Stage using Interpretable Machine Learning,” SoutheastCon 2023, Orlando, FL, USA, 2023, pp. 261-265, doi: 10.1109/SoutheastCon51012.2023.10115152.
  • 14. Bassam G, Rouai A, Ahmad R and Khan MA. “Diabetes Prediction Empowered with Multi-level Data Fusion and Machine Learning” International Journal of Advanced Computer Science and Applications (IJACSA), 14(10), 2023. http://dx.doi.org/10.14569/IJACSA.2023.0141062
  • 15. Abnoosian K, Farnoosh R and Behzadi MH. Prediction of diabetes disease using an ensemble of machine learning multi-classifier models. BMC Bioinformatics 24, 337 (2023). https://doi.org/10.1186/s12859-023-05465-z
  • 16. Ahmed U et al. “Prediction of Diabetes Empowered With Fused Machine Learning,” in IEEE Access, vol. 10, pp. 8529-8538, 2022, doi: 10.1109/ACCESS.2022.3142097.
  • 17. UC Irvine Machine Learning Repository. CDC Diabetes Health Indicators. https://archive.ics.uci.edu/dataset/891/cdc+diabetes+health+indicators. Accessed: 13.04.2024. DOI 10.24432/C53919
  • 18. R3.6.0, https://cran.r-project.org/bin/windows/base/old/
  • 19. Powers DMW. “Evaluation: From Precision, Recall and F-Measure to ROC, Informedness, Markedness & Correlation”. Journal of Machine Learning Technologies. 2011;2 (1): 37–63.
  • 20. Stehman SV. “Selecting and interpreting measures of thematic classification accuracy”. Remote Sensing of Environment. 1997;62 (1): 77–89. doi:10.1016/S0034-4257(97)00083-7
  • 21. Metz CE. “Basic principles of ROC analysis”. Semin Nucl Med. 1978;8 (4): 283–98. doi:10.1016/s0001-2998(78)80014-2.
  • 22. Sim J, Wright CC. “The Kappa Statistic in Reliability Studies: Use, Interpretation, and Sample Size Requirements” in Physical Therapy. 2005;85, 257-268.
  • 23. Cunningham P, Delany SJ. K-Neighbor Classifiers. J Multiple Classifier Syst. 2007;34(8):1-17.
  • 24. Özkan Y, Sarer Yürekli B, Suner A. Diyabet tanısının tahminlenmesinde denetimli makine öğrenme algoritmalarının performans karşılaştırması. Gümüşhane Üniversitesi Fen Bilimleri Dergisi, 2022;12(1), 211-226. https://doi.org/10.17714/gumusfenbil.820882
  • 25. Nadeem MW, Goh HG, Ponnusamy V, Andonovic I, Khan MA, Hussain M. A Fusion-Based Machine Learning Approach for the Prediction of the Onset of Diabetes. Healthcare. 2021; 9(10):1393. https://doi.org/10.3390/healthcare9101393.
  • 26. Turhan S, Özkan Y, Yürekli BS, Suner A, Doğu E. Sınıf Dengesizliği Varlığında Hastalık Tanısı için Kolektif Öğrenme Yöntemlerinin Karşılaştırılması: Diyabet Tanısı Örneği. Turkiye Klinikleri J Biostat. 2020;12(1):16-26. DOI: 10.5336/biostatic.2019-66816
  • 27. Demirarslan M, Suner A. Sağlık Veri Setlerinde Öznitelik Seçiminin Sınıflandırma Performansına Etkisi. JAIHS 2021; 1(1):6-11. DOI 10.52309/jai.2021.2
  • 28. Ağlarcı AV, Bal C. Effect of various factors on classification performance of ordinal logistic regression. International Journal of Data Mining, Modelling and Management. 2024;16(2):196-208. https://doi.org/10.1504/IJDMMM.2024.138813.
There are 28 citations in total.

Details

Primary Language Turkish
Subjects Endocrinology
Journal Section Research Article
Authors

Ali Vasfi Ağlarcı 0000-0002-9010-4537

Feridun Karakurt 0000-0001-7629-9625

Publication Date December 30, 2024
Submission Date September 13, 2024
Acceptance Date December 19, 2024
Published in Issue Year 2024 Volume: 8 Issue: 3

Cite

APA Ağlarcı, A. V., & Karakurt, F. (2024). K En Yakın Komşu Makine Öğrenme Algoritmasına Dayalı Diabetes Mellitus Tahmini. Turkish Journal of Diabetes and Obesity, 8(3), 265-276.
AMA Ağlarcı AV, Karakurt F. K En Yakın Komşu Makine Öğrenme Algoritmasına Dayalı Diabetes Mellitus Tahmini. Turk J Diab Obes. December 2024;8(3):265-276.
Chicago Ağlarcı, Ali Vasfi, and Feridun Karakurt. “K En Yakın Komşu Makine Öğrenme Algoritmasına Dayalı Diabetes Mellitus Tahmini”. Turkish Journal of Diabetes and Obesity 8, no. 3 (December 2024): 265-76.
EndNote Ağlarcı AV, Karakurt F (December 1, 2024) K En Yakın Komşu Makine Öğrenme Algoritmasına Dayalı Diabetes Mellitus Tahmini. Turkish Journal of Diabetes and Obesity 8 3 265–276.
IEEE A. V. Ağlarcı and F. Karakurt, “K En Yakın Komşu Makine Öğrenme Algoritmasına Dayalı Diabetes Mellitus Tahmini”, Turk J Diab Obes, vol. 8, no. 3, pp. 265–276, 2024.
ISNAD Ağlarcı, Ali Vasfi - Karakurt, Feridun. “K En Yakın Komşu Makine Öğrenme Algoritmasına Dayalı Diabetes Mellitus Tahmini”. Turkish Journal of Diabetes and Obesity 8/3 (December 2024), 265-276.
JAMA Ağlarcı AV, Karakurt F. K En Yakın Komşu Makine Öğrenme Algoritmasına Dayalı Diabetes Mellitus Tahmini. Turk J Diab Obes. 2024;8:265–276.
MLA Ağlarcı, Ali Vasfi and Feridun Karakurt. “K En Yakın Komşu Makine Öğrenme Algoritmasına Dayalı Diabetes Mellitus Tahmini”. Turkish Journal of Diabetes and Obesity, vol. 8, no. 3, 2024, pp. 265-76.
Vancouver Ağlarcı AV, Karakurt F. K En Yakın Komşu Makine Öğrenme Algoritmasına Dayalı Diabetes Mellitus Tahmini. Turk J Diab Obes. 2024;8(3):265-76.

Turkish Journal of Diabetes and Obesity (Turk J Diab Obes) is a scientific publication of Zonguldak Bülent Ecevit University Obesity and Diabetes Research and Application Center.

The Turkish Obesity Research Association and Zonguldak Bülent Ecevit University have signed a collaboration protocol, an important step for the Turkish Journal of Diabetes and Obesity. With this collaboration, the journal now serves as a joint publication platform of the two institutions.

This is a refereed journal published in print and electronic forms. It aims to provide free access to knowledge for relevant national and international organizations and individuals.

The journal is published three times a year (in April, August, and December).

The publication languages of the journal are Turkish and English.