Research Article

Diabetes Risk Prediction with Machine Learning Models

Volume: 2 Number: 2 October 1, 2022

Abstract

Diabetes mellitus (DM) is one of the most common chronic diseases worldwide and a major public health problem. The aim of this study is to predict DM risk with machine learning (ML) models using available data. This analytical study used the “Diabetes Health Indicators Dataset”, which consists of 253,680 records and 21 variables collected annually by the CDC. The open-access dataset was retrieved from Kaggle on March 5, 2022. Data analysis was performed in Python 3.0 using the numpy, pandas, matplotlib, seaborn, scikit-learn, and imblearn libraries. During data pre-processing, outliers and missing data were removed. Five ML algorithms were used for predictive modeling: KNN, logistic regression, decision tree, random forest, and naive Bayes. The predictive performance of the algorithms was evaluated with accuracy, precision, recall, and F1 score. No ethical permission was required because the data are open access. KNN achieved an accuracy of 0.74, precision of 0.31, recall of 0.55, and F1 score of 0.39; logistic regression achieved an accuracy of 0.72, precision of 0.33, recall of 0.74, and F1 score of 0.46; decision tree achieved an accuracy of 0.84, precision of 0.54, recall of 0.15, and F1 score of 0.24; random forest achieved an accuracy of 0.84, precision of 0.56, recall of 0.16, and F1 score of 0.25; naive Bayes achieved an accuracy of 0.84, precision of 0.52, recall of 0.19, and F1 score of 0.28. In this study, ML algorithms were used for DM risk estimation. According to the experimental results, when the dataset is randomly split into training (80%) and testing (20%) sets, the accuracy values of the random forest and decision tree algorithms are very close to each other (RF: 0.848, DT: 0.847). Therefore, in terms of accuracy, it can be said that the two best algorithms for diabetes risk estimation are random forest and decision tree.
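The four evaluation metrics named in the abstract can be illustrated with a short Python sketch. It is not the authors' code: the function names `f1_score` and `metrics_from_counts` are hypothetical, and the confusion-matrix counts at the end are invented purely for illustration. The precision and recall pairs are the rounded values reported in the abstract, so the recomputed F1 values may differ from the reported ones in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
    }


# (precision, recall) pairs as reported in the abstract, rounded to two decimals.
reported = {
    "KNN": (0.31, 0.55),
    "Logistic regression": (0.33, 0.74),
    "Decision tree": (0.54, 0.15),
    "Random forest": (0.56, 0.16),
    "Naive Bayes": (0.52, 0.19),
}

if __name__ == "__main__":
    for name, (p, r) in reported.items():
        print(f"{name}: F1 recomputed from precision/recall = {f1_score(p, r):.3f}")

    # Hypothetical counts (NOT from the study): a classifier that finds few of
    # the positive cases can still score a high accuracy on imbalanced data.
    print(metrics_from_counts(tp=21, fp=18, fn=119, tn=842))
```

Because accuracy counts correct negatives as well as correct positives, it can stay high while recall is low, which is why the abstract reports all four metrics side by side.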

Details

Primary Language

English

Subjects

Clinical Sciences

Journal Section

Research Article

Publication Date

October 1, 2022

Submission Date

June 8, 2022

Acceptance Date

August 24, 2022

Published in Issue

Year 2022 Volume: 2 Number: 2

APA
Özsezer, G., & Mermer, G. (2022). Diabetes Risk Prediction with Machine Learning Models. Artificial Intelligence Theory and Applications, 2(2), 1-9. https://izlik.org/JA88DF97RS