Timely diagnosis of heart disease remains a global healthcare priority. Although machine learning (ML) models offer promising solutions, problems such as overfitting, limited interpretability, and dependence on high-dimensional features persist. This study introduces a feature-efficient, stacking-based ensemble framework for heart disease prediction that combines dimensionality reduction with hybrid modeling. Five hybrid models were proposed and evaluated: Hybrid Model 1 used Logistic Regression (LR), Naive Bayes (NB), and Random Forest (RF) as base learners with a Ridge Classifier as the meta-learner; Hybrid Model 2a used LR, NB, and Extreme Gradient Boosting (XGB) as base learners with a Ridge Classifier as the meta-learner; Hybrid Model 2b retained the base learners of 2a but used LR as the meta-learner; Hybrid Model 3 used LR and NB as base learners with a Ridge Classifier as the meta-learner; and Hybrid Model 4 used LR, NB, and RF as base learners with LR as the meta-learner. The models were evaluated on subsets of 13, 9, and 6 features from the Cleveland dataset, using both an 80:20 train-test split and 10-fold stratified cross-validation. Hybrid Model 1 attained the highest accuracy, 91.27%, with an AUC of 0.906 (95% CI: 0.852–0.961), using only 9 features. Performance metrics were supported by confidence intervals, ROC curve analysis, and confusion matrices. The results indicate that stacking ensembles built on a reduced set of high-impact features can provide scalable, interpretable, and accurate diagnostic support for cardiovascular healthcare. Future research will focus on external validation, explainability integration, and cross-population generalizability.
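As an illustration of the stacking configuration described for Hybrid Model 1, the following is a minimal sketch using scikit-learn. It is not the authors' implementation. Several choices are assumptions made only for this example: the Gaussian variant of Naive Bayes, the feature scaling before LR, the inner 5-fold split used to train the meta-learner, all hyperparameters, the hypothetical helper name `build_hybrid_model_1`, and the synthetic 9-feature data that stands in for the Cleveland dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_hybrid_model_1(random_state=0):
    """Stack LR, NB, and RF base learners under a Ridge Classifier meta-learner.

    Hypothetical helper: hyperparameters here are illustrative assumptions.
    """
    base_learners = [
        # Scaling before LR is an assumption, added to help convergence.
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        # GaussianNB is an assumption; the abstract only says "Naive Bayes".
        ("nb", GaussianNB()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=random_state)),
    ]
    return StackingClassifier(
        estimators=base_learners,
        final_estimator=RidgeClassifier(),
        cv=5,  # inner folds that produce the meta-learner's training inputs (assumed)
    )


# Synthetic placeholder data with 9 features; NOT the Cleveland dataset.
X, y = make_classification(
    n_samples=300, n_features=9, n_informative=6, random_state=0
)

model = build_hybrid_model_1()

# 10-fold stratified cross-validation, as named in the abstract.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")

print(f"mean accuracy over {len(scores)} folds: {scores.mean():.3f}")
```

The other hybrid models would follow the same pattern by changing the `estimators` list or the `final_estimator`. Scores obtained on the synthetic data say nothing about the results reported above.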
Primary Language | English |
---|---|
Subjects | Computing Applications in Life Sciences, Artificial Intelligence (Other) |
Journal Section | Research Article |
Authors | |
Early Pub Date | October 6, 2025 |
Publication Date | October 12, 2025 |
Submission Date | May 7, 2025 |
Acceptance Date | September 3, 2025 |
Published in Issue | Year 2025 Early View |