Improving ICU Mortality Prediction via Meta-Learning and Explainable AI: A MetaCost and LIME Approach
Abstract
This study aimed to improve mortality prediction for Intensive Care Unit (ICU) patients using a meta-learning approach and to evaluate the explainability of individual predictions using the LIME (Local Interpretable Model-agnostic Explanations) method. A total of 428 patient records from the MIMIC-III database were analyzed, each with 48 variables covering demographics, laboratory results (e.g., anion gap, urea nitrogen), and comorbidities. The dataset was imbalanced, with 15% mortality and 85% survival. To address this imbalance, machine learning models (e.g., Gradient Boosting, Random Forest) were adapted using MetaCost, a meta-learning algorithm. Performance was evaluated using metrics suited to imbalanced data: Average Precision (AP), recall, F2 score, and the Matthews correlation coefficient (MCC). Feature importance was validated statistically, and LIME was applied for per-patient interpretability. Univariate analysis identified 24 features that differed significantly (P<0.01) between deceased and surviving patients. The MetaCost-enhanced Gradient Boosting model achieved the best performance, with an AUC of 0.91, AP of 0.75, recall of 0.86, F2 score of 0.85, and MCC of 0.79. These results indicate that MetaCost improves ICU mortality prediction on imbalanced data, while LIME adds interpretability at the individual patient level. This approach can make clinical decision support systems more transparent and reliable. However, further validation on diverse datasets is required to confirm these findings.
Keywords
ICU mortality prediction, Machine learning, Logistic regression, Explainable AI
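The cost-sensitive relabeling idea behind MetaCost, and the imbalance-aware metrics named in the abstract, can be illustrated with a minimal Python sketch. This is not the study's implementation. It runs on synthetic data from scikit-learn's `make_classification` (only the sample count, feature count, and class ratio mirror the abstract), the helper name `metacost_fit` and the cost matrix values are hypothetical, and class probabilities are averaged over all bootstrap models rather than only out-of-bag ones, which is a simplification of the procedure described by Domingos (1999).

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import (average_precision_score, fbeta_score,
                             matthews_corrcoef, recall_score)
from sklearn.model_selection import train_test_split


def metacost_fit(base, X, y, cost, n_resamples=10, seed=0):
    """Hypothetical helper: MetaCost-style relabeling for binary labels 0/1.

    cost[i, j] is the cost of predicting class i when the true class is j.
    Assumes every bootstrap resample contains both classes.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    proba_sum = np.zeros((n, 2))
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)           # bootstrap resample
        model = clone(base).fit(X[idx], y[idx])
        proba_sum += model.predict_proba(X)        # columns ordered as [0, 1]
    proba_mean = proba_sum / n_resamples
    # Expected cost of predicting class i: sum_j P(j|x) * cost[i, j]
    expected_cost = proba_mean @ cost.T
    y_relabel = np.argmin(expected_cost, axis=1)   # cheapest class per sample
    # Final model is trained on the relabeled training set
    return clone(base).fit(X, y_relabel), y_relabel


# Synthetic stand-in data, NOT MIMIC-III: 428 rows, 48 features, ~15% positives
X, y = make_classification(n_samples=428, n_features=48, n_informative=10,
                           weights=[0.85, 0.15], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=0)

# Assumed costs: a missed death (predict 0, truth 1) is 5x a false alarm
cost = np.array([[0.0, 5.0],
                 [1.0, 0.0]])

base = GradientBoostingClassifier(n_estimators=50, random_state=0)
model, y_relabel = metacost_fit(base, X_tr, y_tr, cost)

y_pred = model.predict(X_te)
y_score = model.predict_proba(X_te)[:, 1]

ap = average_precision_score(y_te, y_score)
rec = recall_score(y_te, y_pred)
f2 = fbeta_score(y_te, y_pred, beta=2)             # weights recall over precision
mcc = matthews_corrcoef(y_te, y_pred)
print(f"AP={ap:.2f} recall={rec:.2f} F2={f2:.2f} MCC={mcc:.2f}")
```

With the assumed costs, a training sample is relabeled as positive whenever its averaged mortality probability exceeds 1/6, which shifts the final model toward higher recall. The printed values describe the synthetic data only and are not comparable to the results reported in the abstract.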