Research Article

The Impact of Balancing Techniques and Feature Selection on Machine Learning Models for Diabetes Detection

Volume: 37 Issue: 1, 27 March 2025

Abstract

Detecting diabetes is crucial for effective management and prevention of a disease that poses significant health risks worldwide. This study introduces a novel approach to diabetes detection that combines data balancing techniques with feature selection methods, including Lasso (L1) regularization, to improve the performance of predictive models on imbalanced datasets. Random Under Sampling (RUS), Adaptive Synthetic Sampling (ADASYN), and Synthetic Minority Over-sampling Technique (SMOTE) were applied alongside Random Forest (RF), CatBoost (CB), Extreme Gradient Boosting (XGB), K-Nearest Neighbors (KNN), Gaussian Naive Bayes (GNB), Logistic Regression (LR), and Gradient Boosting (GB) models to assess their impact on accuracy and generalization. The findings show that the RF model achieved the highest accuracy of 93.25% with the SMOTE technique, underscoring the importance of appropriate data handling strategies for predictive performance. Furthermore, when all features were used without selection, the RF model attained an accuracy of 95.31%, indicating its capacity to capture complex patterns when the full feature set is available. The methodology used in the study achieved higher accuracy in diabetes detection than previously reported in the literature and offers useful insights for developing reliable prediction models in healthcare.
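To make the two balancing ideas named in the abstract concrete, the following is a minimal sketch of random under-sampling and SMOTE-style interpolation in plain Python. It is an illustration under stated assumptions, not the study's implementation: the function names, the toy two-feature data, and the choice of k = 3 neighbours are all hypothetical, and the study's own dataset and parameters are not given here.

```python
import math
import random


def random_under_sample(majority, minority, rng):
    """RUS: keep a random majority subset the same size as the minority class."""
    return rng.sample(majority, len(minority)), list(minority)


def smote(minority, n_new, k, rng):
    """SMOTE-style oversampling: build n_new synthetic minority points by
    interpolating between a minority sample and one of its k nearest
    minority neighbours."""
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


rng = random.Random(0)
# Toy, made-up two-feature data (an assumption, not the study's dataset).
majority = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(40)]
minority = [(rng.gauss(3.0, 0.5), rng.gauss(3.0, 0.5)) for _ in range(8)]

# Oversample the minority class up to the majority class size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=3, rng=rng)
balanced_minority = minority + new_points

# Alternatively, shrink the majority class down to the minority class size.
rus_majority, rus_minority = random_under_sample(majority, minority, rng)
```

Under-sampling discards majority records, while SMOTE adds interpolated minority records; in practice either step would be applied to the training split only, so that evaluation data stays untouched.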

Details

Primary Language

English

Subjects

Machine Learning (Other)

Section

Research Article

Publication Date

27 March 2025

Submission Date

25 September 2024

Acceptance Date

24 January 2025

Published in Issue

Year 2025 Volume: 37 Issue: 1

Cite

APA
Sinap, V. (2025). The Impact of Balancing Techniques and Feature Selection on Machine Learning Models for Diabetes Detection. Fırat Üniversitesi Mühendislik Bilimleri Dergisi, 37(1), 303-320. https://doi.org/10.35234/fumbd.1556260