TY - JOUR
T1 - Lightweight and Scalable Hybrid Ensemble Learning with Reduced Feature Sets for Robust and Interpretable Heart Disease Prediction
AU - Parmar, Kumar
AU - Jain, Rituraj
AU - Palaniappan, Damodharan
AU - T, Premavathi
PY - 2025
DA - October
Y2 - 2025
JF - Gazi University Journal of Science
PB - Gazi University
WT - DergiPark
SN - 2147-1762
SP - 1
EP - 1
LA - en
AB - Timely diagnosis of heart disease remains a global healthcare priority. Although machine learning (ML) models offer promising solutions, issues such as overfitting, limited interpretability, and dependency on high-dimensional features persist. This study introduces a feature-efficient, stacking-based ensemble framework for heart disease prediction that combines dimensionality reduction with hybrid modeling. Five hybrid models were proposed and evaluated: Hybrid Model 1 used Logistic Regression (LR), Naive Bayes (NB), and Random Forest (RF) as base learners with Ridge Classifier as the meta-learner; Hybrid Model 2a employed LR, NB, and Extreme Gradient Boosting (XGB) with Ridge as the meta-learner; Hybrid Model 2b retained the same base learners as 2a but used LR as the meta-learner; Hybrid Model 3 used LR and NB as base learners with Ridge Classifier as the meta-learner; and Hybrid Model 4 used LR, NB, and RF as base learners with LR as the meta-learner. Models were evaluated on subsets of 13, 9, and 6 features from the Cleveland dataset, using both 80:20 train-test splits and 10-fold stratified cross-validation. Hybrid Model 1 attained the highest accuracy of 91.27% and an AUC of 0.906 (95% CI: 0.852–0.961) using only 9 features. Performance metrics were supported by confidence intervals, ROC curve analysis, and confusion matrices. The results demonstrate that stacking ensembles with reduced, high-impact features provide scalable, interpretable, and accurate diagnostic support for cardiovascular healthcare.
Future research will focus on external validation, explainability integration, and cross-population generalizability.
KW - Ensemble learning
KW - Feature selection
KW - Heart disease
KW - Hybrid models
KW - Machine learning
CR - [1] Chang, Y., Dong, M., Fan, L., Kang, B., Sun, W., Li, X., Yang, Z., and Ren, M., “Research on noninvasive electrophysiologic imaging based on cardiac electrophysiology simulation and deep learning methods for the inverse problem”, BMC Cardiovascular Disorders, 25(1): 335, (2025). DOI: https://doi.org/10.1186/s12872-025-04728-2.
CR - [2] Bansal, A., and Hiwale, K., “Updates in the management of coronary artery disease: A review article”, Cureus, 15(12): e50644, (2023). DOI: https://doi.org/10.7759/cureus.50644.
CR - [3] Iacobescu, P., Marina, V., Anghel, C., and Anghele, A. D., “Evaluating binary classifiers for cardiovascular disease prediction: Enhancing early diagnostic capabilities”, Journal of Cardiovascular Development and Disease, 11(12): 396, (2024). DOI: https://doi.org/10.3390/jcdd11120396.
CR - [4] Chowdhury, M. A., Rizk, R., Chiu, C., Zhang, J. J., Scholl, J. L., Bosch, T. J., Singh, A., Baugh, L. A., McGough, J. S., Santosh, K., and Chen, W. C. W., “The heart of transformation: exploring artificial intelligence in cardiovascular disease”, Biomedicines, 13(2): 427, (2025). DOI: https://doi.org/10.3390/biomedicines13020427.
CR - [5] Chowdhury, M. A., Rizk, R., Chiu, C., Zhang, J. J., Scholl, J. L., Bosch, T. J., Singh, A., Baugh, L. A., McGough, J. S., Santosh, K., and Chen, W. C. W., “Delving into machine learning’s influence on disease diagnosis and prediction”, The Open Public Health Journal, 17(1): (2024). DOI: https://doi.org/10.2174/0118749445297804240401061128.
CR - [6] Ogunpola, A., Saeed, F., Basurra, S., Albarrak, A. M., and Qasem, S. N., “Machine Learning-Based Predictive Models for detection of cardiovascular diseases”, Diagnostics, 14(2): 144, (2024). DOI: https://doi.org/10.3390/diagnostics14020144.
CR - [7] Chaudhari, J. P., Patel, K. P., Mewada, H. K., Jayswal, H. S., Kosta, Y. P., Bhagat, K. S., and Kirange, S. D., “Recursive feature elimination and optimized hybrid ensemble approach for early heart disease prediction”, Advances in Technology Innovation, 10(1): 58–71, (2025). DOI: https://doi.org/10.46604/aiti.2024.13825.
CR - [8] Al-Alshaikh, H. A., P, P., Poonia, R. C., Saudagar, A. K. J., Yadav, M., AlSagri, H. S., and AlSanad, A. A., “Comprehensive evaluation and performance analysis of machine learning in heart disease prediction”, Scientific Reports, 14(1): 7819, (2024). DOI: https://doi.org/10.1038/s41598-024-58489-7.
CR - [9] Mahajan, P., Uddin, S., Hajati, F., and Moni, M. A., “Ensemble Learning for Disease Prediction: A review”, Healthcare, 11(12): 1808, (2023). DOI: https://doi.org/10.3390/healthcare11121808.
CR - [10] Cinar, I., Taspinar, Y. S., and Koklu, M., “Development of early stage diabetes prediction model based on stacking approach”, Tehnički Glasnik, 17(2): 153–159, (2023). DOI: https://doi.org/10.31803/tg-20211119133806.
CR - [11] Islam, Md. M., Tania, T. N., Akter, S., and Shakib, K. H., “An improved heart disease prediction using stacked ensemble method”, in M.S. (eds) Machine Intelligence and Emerging Technologies. MIET 2022. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, 84–97, (2023). DOI: https://doi.org/10.1007/978-3-031-34619-4_8.
CR - [12] Mohan, K., Al-Mamari, Y. F. S., and Al-Najadi, M. A. M., “Techniques of machine learning for detecting heart failure”, International Journal of Data Informatics and Intelligent Computing, 2(2): 47–54, (2023). DOI: https://doi.org/10.59461/ijdiic.v2i2.62.
CR - [13] Naser, M. A., Majeed, A. A., Alsabah, M., Al-Shaikhli, T. R., and Kaky, K. M., “A review of machine learning’s role in cardiovascular disease prediction: Recent advances and future challenges”, Algorithms, 17(2): 78, (2024). DOI: https://doi.org/10.3390/a17020078.
CR - [14] Chen, R.-C., Dewi, C., Huang, S.-W., and Caraka, R. E., “Selecting critical features for data classification based on machine learning methods”, Journal of Big Data, 7(1): 52, (2020). DOI: https://doi.org/10.1186/s40537-020-00327-4.
CR - [15] Dutta, A., Batabyal, T., Basu, M., and Acton, S. T., “An efficient convolutional neural network for coronary heart disease prediction”, Expert Systems With Applications, 159: 113408, (2020). DOI: https://doi.org/10.1016/j.eswa.2020.113408.
CR - [16] Archana, K. S., Sivakumar, B., Kuppusamy, R., Teekaraman, Y., and Radhakrishnan, A., “Automated cardioailment identification and Prevention by hybrid machine learning models”, Computational and Mathematical Methods in Medicine, 2022: 1–8, (2022). DOI: https://doi.org/10.1155/2022/9797844.
CR - [17] Lu, T., “From Data to diagnosis: Effective machine learning-based heart disease prediction”, Science and Technology of Engineering Chemistry and Environmental Protection, 1(9): (2024). DOI: https://doi.org/10.61173/7fqjnk03.
CR - [18] Chen, L., “Heart disease prediction utilizing machine learning techniques”, Transactions on Materials Biotechnology and Life Sciences, 3: 35–50, (2024). DOI: https://doi.org/10.62051/e054hq43.
CR - [19] Nazir, N., Saraogi, A., Kaur, K., Bera, V., Singh, A. K., Dahiya, O., Singh, N. K., Bhardwaj, K., and Yadav, S. A., “Heart disease prediction using machine learning”, in 2023 6th International Conference on Contemporary Computing and Informatics (IC3I), IEEE, 62–66, (2023). DOI: https://doi.org/10.1109/ic3i59117.2023.10397848.
CR - [20] Ekong, A., “Evaluation of machine learning techniques towards early detection of cardiovascular diseases”, American Journal of Artificial Intelligence, 7(1): 6–16, (2023). DOI: https://doi.org/10.11648/j.ajai.20230701.12.
CR - [21] Zhang, H., and Mu, R., “Refining heart disease prediction accuracy using hybrid machine learning techniques with novel metaheuristic algorithms”, International Journal of Cardiology, 416: 132506, (2024). DOI: https://doi.org/10.1016/j.ijcard.2024.132506.
CR - [22] Bhamare, V., Chikhale, S. R., Sawakare, N. S., Kurkunde, A. Y., and Autade, M. S., “Heart disease prediction using machine learning algorithms”, International Journal for Research in Applied Science and Engineering Technology, 12(4): 559–564, (2024). DOI: https://doi.org/10.22214/ijraset.2024.59580.
CR - [23] Ishikita, A., McIntosh, C., Hanneman, K., Lee, M. M., Liang, T., Karur, G. R., Roche, S. L., Hickey, E., Geva, T., Barron, D. J., and Wald, R. M., “Machine learning for prediction of adverse cardiovascular events in adults with repaired tetralogy of Fallot using clinical and cardiovascular magnetic resonance imaging variables”, Circulation Cardiovascular Imaging, 16(6): (2023). DOI: https://doi.org/10.1161/circimaging.122.015205.
CR - [24] Domyati, A., and Memon, Q., “Machine learning based improved heart disease detection with confidence”, International Journal of Online and Biomedical Engineering, 19(8): 130–143, (2023). DOI: https://doi.org/10.3991/ijoe.v19i08.37417.
CR - [25] Ramalingamsakthivelan, N. M. K., Silambarasan, V., Thavasi, S., and Shankar, P. V., “Heart disease risk assessment by using light GBM technique”, International Journal for Multidisciplinary Research, 5(2): (2023). DOI: https://doi.org/10.36948/ijfmr.2023.v05i02.2620.
CR - [26] Pavithraa, G., “Analysis and comparison of prediction of heart disease using novel genetic algorithm and XGBOOST algorithm”, Cardiometry, 25: 778–782, (2023). DOI: https://doi.org/10.18137/cardiometry.2022.25.778782.
CR - [27] Gadde, R., and Kumar, N. S., “Analysis and comparison of neural network algorithm for prediction of cardiovascular disease over support vector machine algorithm with improved precision”, Cardiometry, 25: 970–976, (2023). DOI: https://doi.org/10.18137/cardiometry.2022.25.970976.
CR - [28] Gadde, R., and Kumar, N. S., “Analysis and Comparison of Random Forest Algorithm for Prediction of Cardiovascular Disease over Support Vector Machine Algorithm with Improved Precision”, Cardiometry, 25: 977–982, (2023). DOI: https://doi.org/10.18137/cardiometry.2022.25.977982.
CR - [29] Pavithraa, G., and Sivaprasad, S., “Analysis and comparison of prediction of heart disease using novel random Forest and Naive Bayes algorithm”, Cardiometry, 25: 788–793, (2023). DOI: https://doi.org/10.18137/cardiometry.2022.25.788793.
CR - [30] Janosi, A., Steinbrunn, W., Pfisterer, M., and Detrano, R., “Heart Disease”, UCI Machine Learning Repository, (1989). DOI: https://doi.org/10.24432/C52P4X.
CR - [31] Marelli, A. J., Li, C., Liu, A., Nguyen, H., Moroz, H., Brophy, J. M., Guo, L., Buckeridge, D. L., Tang, J., Yang, A. Y., and Li, Y., “Machine learning informed diagnosis for congenital heart disease in large claims data source”, Journal of the American College of Cardiology: Advances, 3(2): 100801, (2023). DOI: https://doi.org/10.1016/j.jacadv.2023.100801.
CR - [32] Segar, M. W., Jaeger, B. C., Patel, K. V., Nambi, V., Ndumele, C. E., Correa, A., Butler, J., Chandra, A., Ayers, C., Rao, S., Lewis, A. A., Raffield, L. M., Rodriguez, C. J., Michos, E. D., Ballantyne, C. M., Hall, M. E., Mentz, R. J., de Lemos, J. A., and Pandey, A., “Development and validation of machine learning–based race-specific models to predict 10-year risk of heart failure: a multicohort analysis”, Circulation, 143(24): 2370–2383, (2021). DOI: https://doi.org/10.1161/circulationaha.120.053134.
CR - [33] Siefkes, H., Oliveira, L. C., Koppel, R., Hogan, W., Garg, M., Manalo, E., Cresalia, N., Lai, Z., Tancredi, D., Lakshminrusimha, S., and Chuah, C., “Machine learning–based critical congenital heart disease screening using dual‐site pulse oximetry measurements”, Journal of the American Heart Association, 13(12): (2024). DOI: https://doi.org/10.1161/jaha.123.033786.
UR - https://dergipark.org.tr/en/pub/gujs/issue//1694513
L1 - https://dergipark.org.tr/en/download/article-file/4848137
ER -