Research Article

Comparison of Different Machine Learning Models with Data Balancing for Prediction of Cardiovascular Disease Risks Based on Big Data

Volume: 16, Issue: 2, 1 June 2026

Abstract

Cardiovascular diseases (CVD) are a leading cause of death and morbidity worldwide. This study evaluates four data balancing techniques (SMOTE, ENN, SMOTE-ENN, and SMOTE-Tomek) in combination with machine learning (ML) algorithms for predicting CVD risk from big data. The 2021 CDC BRFSS dataset, comprising 308,854 records, was preprocessed by removing missing and irrelevant data and was split into 80% training and 20% testing subsets. Five ML models (logistic regression, random forest, LightGBM, XGBoost, and CatBoost) were trained on the balanced data and evaluated using accuracy, precision, recall, F1 score, ROC curve, and AUC. SMOTE-ENN and SMOTE-Tomek improved model performance, with LightGBM and CatBoost achieving the highest AUC and F1 scores. The results indicate that data balancing, especially SMOTE-ENN, increases model sensitivity and thereby aids the identification of CVD risk. These findings underscore the potential of ML in nursing for developing targeted interventions and improving outcomes.
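To make the balancing step concrete, the following is a minimal sketch of the core SMOTE idea: each synthetic minority record is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. It is not the study's implementation; the function names `smote` and `balance_training_set`, the toy data, and the choice to resample only the training split are assumptions made for illustration, and the sketch uses only the Python standard library rather than any particular resampling package.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic points interpolated between minority neighbours."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    if k < 1:
        raise ValueError("SMOTE needs at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def balance_training_set(X, y, minority_label=1, k=5, seed=0):
    """Oversample the minority class until both classes have equal counts."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    new_points = smote(minority, n_majority - len(minority), k=k, seed=seed)
    return X + new_points, y + [minority_label] * len(new_points)


# Hypothetical toy training split: 8 majority (0) and 3 minority (1) records.
X_train = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (0.3, 0.2),
           (0.4, 0.1), (0.2, 0.4), (0.0, 0.5), (0.5, 0.0),
           (2.0, 2.0), (2.2, 1.9), (1.9, 2.3)]
y_train = [0] * 8 + [1] * 3

X_bal, y_bal = balance_training_set(X_train, y_train, k=2)
print(y_bal.count(0), y_bal.count(1))  # prints: 8 8
```

Hybrid methods such as SMOTE-ENN and SMOTE-Tomek follow this oversampling with a cleaning pass that removes samples judged noisy or borderline; that pass is omitted here. Keeping the test split untouched, as assumed in this sketch, means the reported metrics reflect the original class distribution.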

Keywords

Details

Primary Language

English

Subjects

Software Engineering (Other)

Section

Research Article

Publication Date

1 June 2026

Submission Date

9 October 2025

Acceptance Date

28 December 2025

Published in Issue

Year 2026, Volume: 16, Issue: 2

Cite

APA
Özsezer, G. (2026). Comparison of Different Machine Learning Models with Data Balancing for Prediction of Cardiovascular Disease Risks Based on Big Data. Journal of the Institute of Science and Technology, 16(2), 461-487. https://doi.org/10.21597/jist.1800624
AMA
1. Özsezer G. Comparison of Different Machine Learning Models with Data Balancing for Prediction of Cardiovascular Disease Risks Based on Big Data. Iğdır Üniv. Fen Bil Enst. Der. 2026;16(2):461-487. doi:10.21597/jist.1800624
Chicago
Özsezer, Gözde. 2026. “Comparison of Different Machine Learning Models with Data Balancing for Prediction of Cardiovascular Disease Risks Based on Big Data”. Journal of the Institute of Science and Technology 16 (2): 461-87. https://doi.org/10.21597/jist.1800624.
EndNote
Özsezer G (1 June 2026) Comparison of Different Machine Learning Models with Data Balancing for Prediction of Cardiovascular Disease Risks Based on Big Data. Journal of the Institute of Science and Technology 16 2 461–487.
IEEE
[1] G. Özsezer, “Comparison of Different Machine Learning Models with Data Balancing for Prediction of Cardiovascular Disease Risks Based on Big Data”, Iğdır Üniv. Fen Bil Enst. Der., vol. 16, no. 2, pp. 461–487, Jun. 2026, doi: 10.21597/jist.1800624.
ISNAD
Özsezer, Gözde. “Comparison of Different Machine Learning Models with Data Balancing for Prediction of Cardiovascular Disease Risks Based on Big Data”. Journal of the Institute of Science and Technology 16/2 (1 June 2026): 461-487. https://doi.org/10.21597/jist.1800624.
JAMA
1. Özsezer G. Comparison of Different Machine Learning Models with Data Balancing for Prediction of Cardiovascular Disease Risks Based on Big Data. Iğdır Üniv. Fen Bil Enst. Der. 2026;16:461–487.
MLA
Özsezer, Gözde. “Comparison of Different Machine Learning Models with Data Balancing for Prediction of Cardiovascular Disease Risks Based on Big Data”. Journal of the Institute of Science and Technology, vol. 16, no. 2, June 2026, pp. 461-87, doi:10.21597/jist.1800624.
Vancouver
1. Gözde Özsezer. Comparison of Different Machine Learning Models with Data Balancing for Prediction of Cardiovascular Disease Risks Based on Big Data. Iğdır Üniv. Fen Bil Enst. Der. 1 June 2026;16(2):461-87. doi:10.21597/jist.1800624