Research Article

Early Stage Diabetes Prediction with Machine Learning Methods

Year 2021, Issue: 29, 52 - 57, 01.12.2021
https://doi.org/10.31590/ejosat.1015816

Abstract

Diabetes is a common, incurable, and fatal disease. Millions of people have diabetes, and the disease directly affects their lives. Although early treatment makes it possible to reduce the effects of diabetes and raise patients' standard of living, diagnosis is usually a process that can take years. Machine learning can be applied to data from existing patients for the early diagnosis of diabetes. In this way, diabetes can be diagnosed, and people at risk of developing diabetes can be identified, without a blood test, glucose measurement, or any similar medical procedure. The subject of this study is the development of a machine learning model that can be used in diabetes diagnosis with this approach. In the presented study, eight machine learning approaches were applied to a diabetes dataset built from the data of 520 patients in 16 categories, and their performance was compared using 10-fold cross-validation with the accuracy, precision, recall, and F-score metrics. In addition, the relative importance of the dataset's features for diabetes diagnosis was investigated. All of the developed models achieved a certain level of success. The lowest accuracy, 88.82%, was obtained with Naive Bayes, a simple machine learning technique. The best result was obtained with a one-dimensional convolutional neural network. The model obtained with the convolutional neural network had an accuracy of 99.04%, a precision of 100%, a recall of 98.63%, and an F-score of 99.31%. The results show that the developed classifier can be used as a question set in diabetes diagnosis.
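
As a rough illustration of how the reported metrics relate to one another, the sketch below computes accuracy, precision, recall, and F-score from the four counts of a binary confusion matrix. The function name `classification_metrics` and the example counts are assumptions made for illustration: the counts are hypothetical values that happen to reproduce the percentages quoted above, and they are not taken from the paper.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from binary confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. Ratios with an empty denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts (an assumption, not data from the paper).
    metrics = classification_metrics(tp=72, fp=0, fn=1, tn=31)
    for name, value in metrics.items():
        print(f"{name}: {value * 100:.2f}%")
```

With precision at 100%, the F-score is determined by recall alone: 2 × 0.9863 / (1 + 0.9863) ≈ 0.9931, which agrees with the reported 99.31%.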

References

  • Ampadu, H. (2021, May 01). Random Forests Understanding. AI Pool. https://ai-pool.com/a/s/random-forests-understanding
  • Berkley, C. (2021, May 18). How Is Rapid Weight Loss Related to Diabetes. Verywell Health. https://www.verywellhealth.com/rapid-weight-loss-5101064
  • Bilgin, G. (2021). Makine Öğrenmesi Algoritmaları Kullanarak Erken Dönemde Diyabet Hastalığı Riskinin Araştırılması. Zeki Sistemler Teori ve Uygulamaları Dergisi, 4(1), 55-64. https://doi.org/10.46387/bjesr.790225
  • Cirino, E. (2019, July 6). What Causes Muscle Rigidity. Healthline. https://www.healthline.com/health/muscle-rigidity
  • Coelho, S. (2021, April 28). What Is Blurred Vision. Verywell Health. https://www.verywellhealth.com/blurred-vision-5114184
  • Draelos, R. (2019). Measuring Performance: The Confusion Matrix. Glass Box Medicine. https://glassboxmedicine.com/2019/02/17/measuring-performance-the-confusion-matrix/
  • Harris, M. I., Klein, R., Welborn, T. A. & Knuiman, M. W. (1992). Onset of NIDDM occurs at least 4–7 yr before clinical diagnosis. Diabetes Care, 15(7), 815-819. DOI: 10.2337/diacare.15.7.815
  • Hawkins, D. M., Subhash, C. B. & Mills, D. (2003). Assessing Model Fit by Cross-Validation. Journal of Chemical Information and Computer Sciences, 43(2), 579–586. https://doi.org/10.1021/ci025626i
  • Hickman, R. J. (2020, July 28). What Is Polydipsia. Verywell Health. https://www.verywellhealth.com/polydipsia-4783881
  • IBM Cloud Education. (2020, July 15). What is machine learning. IBM. https://www.ibm.com/cloud/learn/machine-learning
  • Islam, M. M., Ferdousi, R., Rahman, S. & Bushra, H. Y. (2020). Likelihood Prediction of Diabetes at Early Stage Using Data Mining Techniques. Computer Vision and Machine Intelligence in Medical Image Analysis, 113-125. DOI:10.1007/978-981-13-8798-2_12
  • Jones, H. (2021, April 5). Causes of Polyphagia. Verywell Health. verywellhealth.com/polyphagia-5114624
  • Le, T. M., Vo, T. M., Pham, T. N. & Dao, S. V. T. (2020). A Novel Wrapper–Based Feature Selection for Early Diabetes Prediction Enhanced With a Metaheuristic. IEEE Access, 9, 7869-7884. DOI:10.1109/ACCESS.2020.3047942
  • Nahzat, S. & Yağanoğlu, M. (2021). Diabetes Prediction Using Machine Learning Classification Algorithms. Avrupa Bilim ve Teknoloji Dergisi, Ejosat Özel Sayı 2021 (ARACONF), 53-59. DOI: 10.31590/ejosat.899716
  • Oladimeji, O. O., Oladimeji, A. & Oladimeji, O. (2021). Classification Models for Likelihood Prediction of Diabetes at Early Stage Using Feature Selection. Applied Computing and Informatics. https://doi.org/10.1108/ACI-01-2021-0022
  • Oleiwi, A. K., Shi, L., Tao, Y. & Wei, L. (2020). A Comparative Analysis and Risk Prediction of Diabetes at Early Stage using Machine Learning Approach. International Journal of Future Generation Communication and Networking, 13(3), 4151-4163.
  • Özer, İ. (2020). Uzun Kısa Dönem Bellek Ağlarını Kullanarak Erken Aşama Diyabet Tahmini. Mühendislik Bilimleri ve Araştırmaları Dergisi, 2(2), 50-57. https://doi.org/10.38016/jista.877292
  • Petrie, T. (2021, June 07). What Is Paresis. Verywell Health. https://www.verywellhealth.com/paresis-5184820
  • Ramachandran, A. & Chamukuttan, S. (2008). Early Diagnosis and Prevention of Diabetes in Developing Countries. Reviews in Endocrine and Metabolic Disorders, 9(3), 193-201. DOI: 10.1007/s11154-008-9079-z
  • Rish, I. (2001). An Empirical Study of the Naïve Bayes Classifier. IJCAI Workshop on Empirical Methods in AI, 3(22). 41-46.
  • Sadhu, A. & Jadli, A. (2021). Early-Stage Diabetes Risk Prediction: A Comparative Analysis of Classification Algorithms. International Advanced Research Journal in Science, Engineering and Technology (IARJSET), 8(2), 193-201. DOI: 10.17148/IARJSET.2021.8228
  • Thrush. (2019, January 15). Diabetes UK. https://www.diabetes.co.uk/diabetes-complications/diabetes-and-yeast-infections.html.
  • UCI Machine Learning Repository. (2020, July 12). Early stage diabetes risk prediction dataset. https://archive.ics.uci.edu/ml/datasets/Early+stage+diabetes+risk+prediction+dataset.
  • U.S. Department of Health & Human Services. (2004, January 12). Diabetes: A National Plan For Action. The Importance Of Early Diabetes Detection. https://aspe.hhs.gov/report/diabetes-national-plan-action/importance-early-diabetes-detection
  • Watson, S. (2018, September 29). Does Diabetes Cause Hair Loss. Healthline. https://www.healthline.com/health/does-diabetes-cause-hair-loss
  • WHO. (n.d.). Diabetes. Retrieved July 15, 2021, from https://www.who.int/health-topics/diabetes
  • Wood, T. (n.d.). What is a Random Forest. DeepAI. Retrieved August 01, 2021, from https://deepai.org/machine-learning-glossary-and-terms/random-forest

Early Stage Diabetes Prediction Using Machine Learning Methods

Abstract

Diabetes is a common, incurable, and fatal disease. Millions of people worldwide have diabetes, and it directly affects their lives. Early diagnosis helps reduce the effects of diabetes and improve patients' quality of life, but people commonly live with diabetes for years before being diagnosed. Early diagnosis can be achieved by applying machine learning methods to existing patient data. In this way, people can be diagnosed quickly without a glucose screening test or any other blood test: answering a simple question set would be enough to determine whether a person is diabetic or at risk of becoming diabetic. In the proposed study, diabetes is detected with machine learning techniques. A publicly available diabetes dataset, which includes 16 features collected from 520 people, was used to create predictive models. Eight machine learning methods were applied individually to the dataset. The results of each model were validated using a 10-fold cross-validation scheme. In addition to accuracy, other confusion-matrix-based performance metrics (precision, recall, and F1 score) were reported. All of the created models achieved high accuracy. The lowest accuracy, 88.85%, was obtained with Naive Bayes, one of the basic machine learning techniques. The highest accuracy, 99.04%, was obtained with a one-dimensional convolutional neural network (1D CNN) model. The designed 1D CNN model also achieved the highest scores on the other metrics: 100.00%, 98.63%, and 99.31% for precision, recall, and F1 score, respectively. These findings indicate that the created 1D CNN model can be used to identify diabetic patients by asking them only a few questions.
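
The validation scheme described above can be sketched in a few lines. The following is a minimal, standard-library-only illustration of 10-fold cross-validation around a Bernoulli Naive Bayes classifier, not the authors' implementation. All function names are hypothetical, and the data is synthetic: 520 rows of 16 yes/no answers generated at random, whereas a real dataset may contain non-binary features (such as age) that would need encoding first.

```python
import math
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_bernoulli_nb(X, y):
    """Estimate log-priors and P(feature = 1 | class) with Laplace smoothing."""
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = (len(rows) + 1) / (len(X) + 2)
        if rows:
            probs = [(sum(col) + 1) / (len(rows) + 2) for col in zip(*rows)]
        else:
            probs = [0.5] * len(X[0])  # class absent from this training split
        model[label] = (math.log(prior), probs)
    return model


def predict_bernoulli_nb(model, x):
    """Return the class with the highest log-posterior score for one sample."""
    scores = {}
    for label, (log_prior, probs) in model.items():
        scores[label] = log_prior + sum(
            math.log(p if v else 1.0 - p) for v, p in zip(x, probs)
        )
    return max(scores, key=scores.get)


def cross_validate(X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average the accuracy."""
    accuracies = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit_bernoulli_nb([X[i] for i in train], [y[i] for i in train])
        correct = sum(predict_bernoulli_nb(model, X[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / k, accuracies


def make_synthetic(n=520, n_features=16, seed=1):
    """Synthetic yes/no answers: positives answer 'yes' more often (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        p_yes = 0.7 if label else 0.3
        X.append([1 if rng.random() < p_yes else 0 for _ in range(n_features)])
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    mean_acc, per_fold = cross_validate(X, y, k=10)
    print(f"mean 10-fold accuracy on synthetic data: {mean_acc:.3f}")
```

Each sample is used for testing exactly once, so the averaged score does not depend on a single lucky or unlucky train/test split. The accuracy printed here reflects only the synthetic generator and says nothing about the results reported in the abstract.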

There are 27 citations in total.

Details

Primary Language English
Subjects Engineering
Journal Section Articles
Authors

Özge Nur Ergün 0000-0002-9997-0853

Hamza O.ilhan 0000-0002-1753-2703

Early Pub Date December 15, 2021
Publication Date December 1, 2021
Published in Issue Year 2021 Issue: 29

Cite

APA Ergün, Ö. N., & O.ilhan, H. (2021). Early Stage Diabetes Prediction Using Machine Learning Methods. Avrupa Bilim ve Teknoloji Dergisi, (29), 52-57. https://doi.org/10.31590/ejosat.1015816