Research Article

Diabetes Mellitus Prediction Using Gradient Boosting Classification

Year 2020, Ejosat Special Issue 2020 (ICCEES), 268 - 272, 05.10.2020
https://doi.org/10.31590/ejosat.803504

Abstract

Diabetes has become a widespread and endemic health problem worldwide. It is a chronic and also life-threatening disease. It can cause health problems in many organs, such as the heart, kidneys, eyes, nerves, and blood vessels. Early prevention techniques are needed to reduce the diabetes-related mortality rate. Today, machine learning techniques are used to predict or detect various life-threatening diseases such as cancer, diabetes, heart disease, and thyroid disorders. This study presents a diabetes prediction model built on the Pima Indian dataset. Three machine learning techniques, Decision Tree (DT), Random Forest (RF), and Gradient Boosting (GB), were applied to predict diabetes, and their performance was analyzed. The confusion matrix, accuracy, F1 score, precision, recall, and Cohen's kappa were evaluated, and the ROC curve was plotted. Of the three techniques, GB gave the best results.

Prediction of Diabetes Mellitus by using Gradient Boosting Classification

Abstract

Diabetes has become a pervasive and endemic health problem worldwide. It is a chronic, life-threatening disease that can cause health problems in many organs, such as the heart, kidneys, eyes, nerves, and blood vessels. Early prevention techniques are needed to reduce the fatality rate from diabetes. Machine learning techniques are now used to predict or detect various life-threatening diseases such as cancer, diabetes, heart disease, and thyroid disorders. In this study, a prediction model for diabetes mellitus is presented using the Pima Indian dataset. Three machine learning techniques, namely Decision Tree (DT), Random Forest (RF), and Gradient Boosting (GB), were used to predict diabetes mellitus, and their performance was analyzed. The confusion matrix, accuracy, F1 score, precision, recall, and Cohen's kappa were evaluated, and a ROC curve was plotted. Of the three techniques, GB achieved the best results.
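
The evaluation measures named in the abstract can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of that arithmetic in plain Python, not the authors' implementation: the function name `binary_metrics` is hypothetical, the label convention (1 = diabetic, 0 = non-diabetic) is an assumption, and the toy labels in the usage lines are invented for illustration and are not results from the study.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute confusion-matrix metrics for labels in {0, 1}.

    Assumed convention (not from the paper): 1 = diabetic, 0 = non-diabetic.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal in length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    # Guard against empty denominators (e.g. no positive predictions).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Cohen's kappa compares observed agreement (accuracy) with the
    # agreement expected by chance from the row and column totals.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": accuracy, "precision": precision,
        "recall": recall, "f1": f1, "kappa": kappa,
    }


# Toy labels, invented for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Kappa is reported alongside accuracy because it discounts agreement that would occur by chance, which matters when the two classes are unevenly represented.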

There are 16 citations in total.

Details

Primary Language: English
Subjects: Engineering
Journal Section: Articles
Authors

Fatema Nusrat 0000-0001-8495-4925

Betül Uzbaş 0000-0002-0255-5988

Ömer Kaan Baykan 0000-0001-5890-510X

Publication Date: October 5, 2020
Published in Issue: Year 2020, Ejosat Special Issue 2020 (ICCEES)

Cite

APA Nusrat, F., Uzbaş, B., & Baykan, Ö. K. (2020). Prediction of Diabetes Mellitus by using Gradient Boosting Classification. Avrupa Bilim Ve Teknoloji Dergisi, 268-272. https://doi.org/10.31590/ejosat.803504