TY - JOUR
T1 - Detection of COVID-19 Severity and Mortality from Blood Parameters by Ensemble Learning Methods
TT - COVID-19 Şiddeti Ve Mortalitesinin Kan Parametrelerinden Kolektif Öğrenme Yöntemleri İle Tespiti
AU - Erol, Gizemnur
AU - Uzbaş, Betül
PY - 2023
DA - December
DO - 10.7212/karaelmasfen.1363912
JF - Karaelmas Fen ve Mühendislik Dergisi
PB - Zonguldak Bulent Ecevit University
WT - DergiPark
SN - 2146-7277
SP - 316
EP - 328
VL - 13
IS - 2
LA - en
AB - COVID-19 is a pandemic disease that spreads rapidly and can cause Acute Respiratory Distress Syndrome (ARDS). Severe pneumonia in infected individuals has led to large numbers of patients being admitted to the Intensive Care Unit (ICU), pushing health systems beyond capacity and placing unprecedented pressure on them. Predicting the prognosis of the disease is essential so that health systems can keep functioning and the condition of patients who need ICU care does not become critical. In this study, COVID-19 prognosis was predicted with Machine Learning (ML) methods using two datasets: ICU admission (COVID-19 SEVERITY) and COVID-19-related death (COVID-19 MORTALITY). Missing values in the datasets were imputed with K-Nearest Neighbor (KNN), and Min-Max normalization was applied. The datasets were divided into training and test sets three times, and the data were balanced with the Synthetic Minority Oversampling Technique (SMOTE). Classification was then carried out using Ensemble Learning (EL) methods. The AdaBoost classifier achieved 89.54% accuracy on COVID-19 SEVERITY and 97.25% on COVID-19 MORTALITY. Fast and accurate detection of COVID-19 prognosis with ML methods will help use the ICU more efficiently and relieve the pressure on health systems.
KW - Classification models
KW - hematological parameters
KW - machine learning
N2 - COVID-19, yüksek yayılım hızına ve Akut Solunum Sıkıntısı Sendromuna (ARDS) neden olan bir pandemidir.
Enfekte bireylerde gelişen şiddetli pnömoni, çok fazla hastanın Yoğun Bakım Ünitesine (ICU) kabul edilmesine neden olmuştur. Bu da, sağlık sistemlerinde kapasitelerin aşılarak benzeri görülmemiş bir baskı meydana getirmiştir. Sağlık sistemlerinin aktif kalabilmesi ve ICU’ya yatması gereken hastaların durumlarının kritikleşmemesi için bu hastalığın prognozunun belirlenmesi oldukça önemlidir. Bu çalışmada, ICU’ya kabul edilen (COVID-19 SEVERITY) ve COVID-19 nedeni ile ölen (COVID-19 MORTALITY) hastaların bilgilerini içeren veri setleri, Makine Öğrenmesi (ML) yöntemleri kullanılarak COVID-19 prognoz tespiti yapılmıştır. Veri setlerinde bulunan eksik veriler K-En Yakın Komşu (KNN) ile tamamlanmış ve Min-Max normalizasyonu yapılmıştır. Veri setleri, eğitim ve test setleri olarak bölünmüş ve veriler Sentetik Azınlık Aşırı Örnekleme Tekniği (SMOTE) ile dengelenmiştir. Ardından, Kolektif Öğrenme (EL) yöntemleri kullanılarak sınıflandırma gerçekleştirilmiştir. COVID-19 SEVERITY ve COVID-19 MORTALITY için Adaboost sınıflandırıcısı ile sırasıyla %89.54 ve %97.25 başarı elde edilmiştir. ML yöntemleri ile COVID-19 prognozunun başarılı ve hızlı bir şekilde tespit edilmesi, ICU’yu daha verimli kullanmaya ve sağlık sistemlerinin üzerindeki baskıyı hafifletmeye yardımcı olacaktır.
UR - https://doi.org/10.7212/karaelmasfen.1363912
L1 - https://dergipark.org.tr/en/download/article-file/3423201
ER -