Research Article

Comparative Analysis of Diabetes Diagnosis with Machine Learning Methods

Year 2024, Volume: 8 Issue: 1, 22 - 32, 30.06.2024
https://doi.org/10.47897/bilmes.1447878

Abstract

Diabetes is a disease that occurs when the body cannot regulate the level of sugar (glucose) in the blood. Early diagnosis is important for preventing the more serious conditions that may develop later. In this study, different models were trained on a diabetes data set in an attempt to make the best use of it. At the outset of the study, the following models were used: Logistic Regression, KNN, SVM (Support Vector Machine), CART (Classification and Regression Trees), RF (Random Forest), AdaBoost, GBM (Gradient Boosting Machines), XGBoost (Extreme Gradient Boosting), LGBM (Light Gradient Boosting Machine), and CatBoost. Ranked by accuracy and F1 score, the best-performing models were RF, LGBM, and XGBoost, in that order. The Random Forest model, which produced the most successful results, achieved an accuracy of 0.88, an F1 score of 0.84, and a ROC AUC of 0.95.
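
For readers unfamiliar with the three metrics reported above, the following is a minimal sketch of how accuracy, F1 score, and ROC AUC can be computed for a binary classifier. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions chosen for illustration, and the numbers it produces are unrelated to the study's results.

```python
# Illustrative only: toy data and helper names are hypothetical,
# not taken from the study.

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = diabetic, 0 = not) and model scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
# Assumed decision threshold of 0.5 to turn scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("Accuracy:", accuracy(y_true, y_pred))
print("F1 Score:", f1_score(y_true, y_pred))
print("ROC AUC:", roc_auc(y_true, scores))
```

Accuracy and F1 depend on the chosen threshold, while ROC AUC is computed from the raw scores and summarizes ranking quality across all thresholds, which is why a model can report a higher ROC AUC than accuracy.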

Details

Primary Language English
Subjects Image Processing, Machine Learning (Other)
Section Articles
Authors

Tuğba Aktaş 0009-0005-0580-7502

İsmail Mert Temel 0009-0008-7989-9747

Ahmet Saygılı 0000-0001-8625-4842

Publication Date June 30, 2024
Submission Date March 6, 2024
Acceptance Date May 5, 2024
Published in Issue Year 2024 Volume: 8 Issue: 1

Cite

APA Aktaş, T., Temel, İ. M., & Saygılı, A. (2024). Comparative Analysis of Diabetes Diagnosis with Machine Learning Methods. International Scientific and Vocational Studies Journal, 8(1), 22-32. https://doi.org/10.47897/bilmes.1447878
AMA Aktaş T, Temel İM, Saygılı A. Comparative Analysis of Diabetes Diagnosis with Machine Learning Methods. ISVOS. June 2024;8(1):22-32. doi:10.47897/bilmes.1447878
Chicago Aktaş, Tuğba, İsmail Mert Temel, and Ahmet Saygılı. “Comparative Analysis of Diabetes Diagnosis With Machine Learning Methods”. International Scientific and Vocational Studies Journal 8, no. 1 (June 2024): 22-32. https://doi.org/10.47897/bilmes.1447878.
EndNote Aktaş T, Temel İM, Saygılı A (June 1, 2024) Comparative Analysis of Diabetes Diagnosis with Machine Learning Methods. International Scientific and Vocational Studies Journal 8 1 22–32.
IEEE T. Aktaş, İ. M. Temel, and A. Saygılı, “Comparative Analysis of Diabetes Diagnosis with Machine Learning Methods”, ISVOS, vol. 8, no. 1, pp. 22–32, 2024, doi: 10.47897/bilmes.1447878.
ISNAD Aktaş, Tuğba et al. “Comparative Analysis of Diabetes Diagnosis With Machine Learning Methods”. International Scientific and Vocational Studies Journal 8/1 (June 2024), 22-32. https://doi.org/10.47897/bilmes.1447878.
JAMA Aktaş T, Temel İM, Saygılı A. Comparative Analysis of Diabetes Diagnosis with Machine Learning Methods. ISVOS. 2024;8:22–32.
MLA Aktaş, Tuğba et al. “Comparative Analysis of Diabetes Diagnosis With Machine Learning Methods”. International Scientific and Vocational Studies Journal, vol. 8, no. 1, 2024, pp. 22-32, doi:10.47897/bilmes.1447878.
Vancouver Aktaş T, Temel İM, Saygılı A. Comparative Analysis of Diabetes Diagnosis with Machine Learning Methods. ISVOS. 2024;8(1):22-32.


This work is licensed under a Creative Commons Attribution 4.0 International License.