Research Article

Improvement of machine learning-based diabetes diagnosis via resampling techniques

Volume: 32 Number: 2 March 16, 2026

Abstract

This study aims to improve the accuracy of diabetes diagnosis by combining machine learning classifiers with resampling methods. Diabetes datasets are typically imbalanced, which poses a significant challenge for traditional classification algorithms, which often struggle to produce accurate predictions on such data. To improve model performance, a comparative analysis was conducted of a range of over-sampling and under-sampling techniques: SMOTE, ADASYN, Borderline SMOTE, SVM SMOTE, Random Under Sampler, Near Miss, One Sided Selection, Neighbourhood Cleaning Rule, Edited Nearest Neighbours, Instance Hardness Threshold, AllKNN and Tomek Links. These techniques were applied in combination with the Decision Tree, Random Forest, K-Nearest Neighbours, AdaBoost, and Extra Tree classifiers, and performance was evaluated using accuracy, recall, precision, F-score, and AUC-ROC. SVM SMOTE was the most successful resampling technique, achieving 99.06% accuracy in combination with the Decision Tree classifier. The findings show that resampling markedly improves diagnostic performance and yields more dependable predictions. The study contributes to medical informatics by providing a robust framework for diabetes diagnosis and insights into the application of machine learning in healthcare.
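To make the two ingredients of the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows the basic SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the SVM SMOTE variant the study found most successful, and it computes accuracy, precision, recall and F-score from predicted labels. The function names, the toy data points and the example label vectors are all hypothetical and chosen only for illustration.

```python
import math
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (basic SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1} (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy data: 6 majority (class 0) and 3 minority (class 1) points.
majority = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (1.1, 1.3), (0.8, 0.9), (1.3, 1.0)]
minority = [(3.0, 3.0), (3.2, 2.9), (2.9, 3.3)]

# Oversample the minority class until both classes have the same size.
synthetic = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + synthetic

# Hypothetical true and predicted labels, only to exercise the metrics.
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 0, 1])
print(len(balanced_minority), metrics)
```

In a real experiment, a common precaution is to resample only the training split so that synthetic points do not leak into the evaluation data. Whether the study followed this protocol is not stated in the abstract.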

Details

Primary Language: English
Subjects: Machine Learning (Other)
Journal Section: Research Article
Early Pub Date: November 2, 2025
Publication Date: March 16, 2026
Submission Date: November 26, 2024
Acceptance Date: August 20, 2025
Published in Issue: Year 2026, Volume 32, Number 2

APA
Şenyer Yapıcı, İ., Arslan, R., & Engin, M. A. (2026). Improvement of machine learning-based diabetes diagnosis via resampling techniques. Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi, 32(2), 247-258. https://doi.org/10.5505/pajes.2025.52882
AMA
1. Şenyer Yapıcı İ, Arslan R, Engin MA. Improvement of machine learning-based diabetes diagnosis via resampling techniques. Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi. 2026;32(2):247-258. doi:10.5505/pajes.2025.52882
Chicago
Şenyer Yapıcı, İrem, Rukiye Arslan, and Mustafa Alptekin Engin. 2026. “Improvement of Machine Learning-Based Diabetes Diagnosis via Resampling Techniques”. Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi 32 (2): 247-58. https://doi.org/10.5505/pajes.2025.52882.
EndNote
Şenyer Yapıcı İ, Arslan R, Engin MA (March 1, 2026) Improvement of machine learning-based diabetes diagnosis via resampling techniques. Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi 32 2 247–258.
IEEE
[1] İ. Şenyer Yapıcı, R. Arslan, and M. A. Engin, “Improvement of machine learning-based diabetes diagnosis via resampling techniques”, Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi, vol. 32, no. 2, pp. 247–258, Mar. 2026, doi: 10.5505/pajes.2025.52882.
ISNAD
Şenyer Yapıcı, İrem - Arslan, Rukiye - Engin, Mustafa Alptekin. “Improvement of Machine Learning-Based Diabetes Diagnosis via Resampling Techniques”. Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi 32/2 (March 1, 2026): 247-258. https://doi.org/10.5505/pajes.2025.52882.
JAMA
1. Şenyer Yapıcı İ, Arslan R, Engin MA. Improvement of machine learning-based diabetes diagnosis via resampling techniques. Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi. 2026;32:247–258.
MLA
Şenyer Yapıcı, İrem, et al. “Improvement of Machine Learning-Based Diabetes Diagnosis via Resampling Techniques”. Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi, vol. 32, no. 2, Mar. 2026, pp. 247-58, doi:10.5505/pajes.2025.52882.
Vancouver
1. İrem Şenyer Yapıcı, Rukiye Arslan, Mustafa Alptekin Engin. Improvement of machine learning-based diabetes diagnosis via resampling techniques. Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi. 2026 Mar. 1;32(2):247-58. doi:10.5505/pajes.2025.52882