Research Article

Predicting Liver Disease Using Decision Tree Ensemble Methods

Year 2022, Volume: 38 Issue: 2, 261 - 267, 23.08.2022

Abstract

Damage to the liver, which performs vital functions in the human body, can have fatal consequences. For this reason, early diagnosis of liver disease is important. In this study, ensemble learning methods were used to diagnose liver disease from several clinical values obtained from liver patients and healthy blood donors. The Random Forest (RF), J48, AdaBoost, Gradient Boosting Classifier (GBC), and Light Gradient Boosting Machine (Light GBM) algorithms, drawn from bagging and boosting models, were applied. The most successful classification result was obtained with the Light GBM algorithm: 98.8% accuracy, 98.1% precision, 99.4% recall, and a kappa statistic of 0.98, evaluated with 10-fold cross-validation.
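The four evaluation measures named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' implementation: the function name `binary_metrics` and the example counts are hypothetical and are not taken from the study. It also shows why kappa is reported as a coefficient rather than a percentage: it rescales observed agreement against the agreement expected by chance.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and Cohen's kappa from a 2x2 confusion matrix.

    Assumes every denominator is non-zero (both classes are present and
    the classifier predicts both classes at least once).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # share of predicted positives that are correct
    recall = tp / (tp + fn)      # share of actual positives that are found

    # Chance agreement: product of predicted and actual marginals per class.
    p_pos = ((tp + fp) / total) * ((tp + fn) / total)
    p_neg = ((fn + tn) / total) * ((fp + tn) / total)
    expected = p_pos + p_neg

    # Kappa rescales accuracy so that 0 means chance-level agreement
    # and 1 means perfect agreement.
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "kappa": kappa,
    }


# Hypothetical counts for illustration only (not results from the paper).
metrics = binary_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under 10-fold cross-validation, such counts would typically be accumulated over the ten held-out folds before the measures are computed, although the abstract does not state which aggregation the study used.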

There are 29 citations in total.

Details

Primary Language English
Subjects Engineering
Journal Section Articles
Authors

Fırat Orhanbulucu 0000-0003-4558-9667

İrem Acer

Fatma Latifoğlu

Semra İçer

Early Pub Date August 23, 2022
Publication Date August 23, 2022
Published in Issue Year 2022 Volume: 38 Issue: 2

Cite

APA Orhanbulucu, F., Acer, İ., Latifoğlu, F., İçer, S. (2022). Predicting Liver Disease Using Decision Tree Ensemble Methods. Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi, 38(2), 261-267.
AMA Orhanbulucu F, Acer İ, Latifoğlu F, İçer S. Predicting Liver Disease Using Decision Tree Ensemble Methods. Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi. August 2022;38(2):261-267.
Chicago Orhanbulucu, Fırat, İrem Acer, Fatma Latifoğlu, and Semra İçer. “Predicting Liver Disease Using Decision Tree Ensemble Methods”. Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi 38, no. 2 (August 2022): 261-67.
EndNote Orhanbulucu F, Acer İ, Latifoğlu F, İçer S (August 1, 2022) Predicting Liver Disease Using Decision Tree Ensemble Methods. Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi 38 2 261–267.
IEEE F. Orhanbulucu, İ. Acer, F. Latifoğlu, and S. İçer, “Predicting Liver Disease Using Decision Tree Ensemble Methods”, Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi, vol. 38, no. 2, pp. 261–267, 2022.
ISNAD Orhanbulucu, Fırat et al. “Predicting Liver Disease Using Decision Tree Ensemble Methods”. Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi 38/2 (August 2022), 261-267.
JAMA Orhanbulucu F, Acer İ, Latifoğlu F, İçer S. Predicting Liver Disease Using Decision Tree Ensemble Methods. Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi. 2022;38:261–267.
MLA Orhanbulucu, Fırat et al. “Predicting Liver Disease Using Decision Tree Ensemble Methods”. Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi, vol. 38, no. 2, 2022, pp. 261-7.
Vancouver Orhanbulucu F, Acer İ, Latifoğlu F, İçer S. Predicting Liver Disease Using Decision Tree Ensemble Methods. Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi. 2022;38(2):261-7.
