TY - JOUR T1 - MACHINE LEARNING-BASED CLASSIFICATION OF HBV AND HCV-RELATED HEPATOCELLULAR CARCINOMA USING GENOMIC BIOMARKERS TT - GENOMİK BİYOBELİRTEÇLER KULLANILARAK HBV VE HCV İLE İLİŞKİLİ HEPATOSELLÜLER KARSİNOMUN MAKİNE ÖĞRENİMİ TABANLI SINIFLANDIRILMASI AU - Akbulut, Sami AU - Küçükakçalı, Zeynep AU - Çolak, Cemil PY - 2022 DA - October DO - 10.26650/IUITFD.1130442 JF - Journal of Istanbul Faculty of Medicine JO - İst Tıp Fak Derg PB - Istanbul University WT - DergiPark SN - 1305-6441 SP - 532 EP - 540 VL - 85 IS - 4 LA - en AB - Objective: Knowing the underlying cause of hepatocellular carcinoma (HCC) is crucial for optimal management. This study aims to classify open-access gene expression data from HCC patients with an HBV or HCV infection using the XGBoost method. Material and Methods: This case-control study used open-access gene expression data from patients with HBV-related HCC and HCV-related HCC. Data from 17 patients with HBV+HCC and 17 patients with HCV+HCC were included. An XGBoost model was built for the classification using tenfold cross-validation. Model performance was assessed with accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score. Results: Feature selection yielded 17 genes, which were used as the input variables for modeling. The accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score obtained from the XGBoost model were 97.1%, 97.1%, 94.1%, 100%, 100%, 94.4%, and 97%, respectively. Based on the variable importance results from the XGBoost model, the ALDOC, GLUD2, TRAPPC10, FLJ12998, RPL39, KDELR2, and KIAA0446 genes may serve as potential biomarkers for HBV-related HCC. 
Conclusion: In this study, HCC arising from two different etiological factors (HBV and HCV) was classified using a machine learning-based prediction approach, and genes that could serve as biomarkers for HBV-related HCC were identified. Once these genes have been clinically validated in subsequent research, therapeutic procedures based on them can be established and their utility in clinical practice documented. KW - Hepatocellular carcinoma KW - Hepatitis B infection KW - Hepatitis C infection KW - Machine learning KW - Classification N2 - Objective: Knowing the underlying causes of hepatocellular carcinoma (HCC) is very important for its optimal management. This study aims to classify open-access gene expression data from HCC patients with HBV or HCV infection using the XGBoost method. Material and Methods: This case-control study considered open-access gene expression data from patients with HBV- and HCV-related HCC. For this purpose, data from 17 HBV+HCC and 17 HCV+HCC patients were included in the study. An XGBoost model was built for the classification using tenfold cross-validation. Accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score were evaluated as model performance metrics. Results: Using the feature selection approach, 17 genes were selected, and modeling was carried out with these input variables. The accuracy, balanced accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score obtained from the XGBoost model were 97.1%, 97.1%, 94.1%, 100%, 100%, 94.4%, and 97%, respectively. Based on the variable importance findings from XGBoost, the ALDOC, GLUD2, TRAPPC10, FLJ12998, RPL39, KDELR2, and KIAA0446 genes can be used as potential biomarkers for HBV-related HCC. 
Conclusion: In this study, two different etiological factors causing HCC (HBV and HCV) were classified using a machine learning-based prediction approach, and genes that could be biomarkers for HBV-related HCC were identified. Once the resulting genes have been clinically validated in subsequent research, therapeutic procedures based on these genes can be established and their use in clinical practice documented. CR - 1. Aly A, Ronnebaum S, Patel D, Doleh Y, Benavente F. Epidemiologic, humanistic and economic burden of hepatocellular carcinoma in the USA: a systematic literature review. Hepat Oncol 2020;7(3):HEP27. doi: 10.2217/hep-2020-0024. CR - 2. Llovet JM, Kelley RK, Villanueva A, Singal AG, Pikarsky E, Roayaie S, et al. Hepatocellular carcinoma. Nat Rev Dis Primers 2021;7(1):6. CR - 3. Sayiner M, Golabi P, Younossi ZM. Disease burden of hepatocellular carcinoma: a global perspective. Dig Dis Sci 2019;64(4):910-7. CR - 4. Levrero M, Zucman-Rossi J. Mechanisms of HBV-induced hepatocellular carcinoma. J Hepatol 2016;64(1 Suppl):S84-101. CR - 5. El-Serag HB, Rudolph KL. Hepatocellular carcinoma: epidemiology and molecular carcinogenesis. Gastroenterology 2007;132(7):2557-76. CR - 6. Venook AP, Papandreou C, Furuse J, Ladron de Guevara L. The incidence and epidemiology of hepatocellular carcinoma: a global and regional perspective. Oncologist 2010;15(S4):5-13. CR - 7. Fattovich G, Stroffolini T, Zagni I, Donato F. Hepatocellular carcinoma in cirrhosis: incidence and risk factors. Gastroenterology 2004;127(5):S35-50. CR - 8. Ming L, Thorgeirsson SS, Gail MH, Lu P, Harris CC, Wang N, et al. Dominant role of hepatitis B virus and cofactor role of aflatoxin in hepatocarcinogenesis in Qidong, China. Hepatology 2002;36(5):1214-20. CR - 9. 
Chen C-J, Yang H-I, Su J, Jen C-L, You S-L, Lu S-N, et al. Risk of hepatocellular carcinoma across a biological gradient of serum hepatitis B virus DNA level. JAMA 2006;295(1):65-73. CR - 10. Di Bisceglie AM. Hepatitis B and hepatocellular carcinoma. Hepatology 2009;49(S5):S56-60. CR - 11. Dash S, Aydin Y, Widmer KE, Nayak L. Hepatocellular carcinoma mechanisms associated with chronic HCV infection and the impact of direct-acting antiviral treatment. J Hepatocell Carcinoma 2020;7:45. CR - 12. Gower E, Estes C, Blach S, Razavi-Shearer K, Razavi H. Global epidemiology and genotype distribution of the hepatitis C virus infection. J Hepatol 2014;61(1):S45-57. CR - 13. El-Serag HB. Epidemiology of viral hepatitis and hepatocellular carcinoma. Gastroenterology 2012;142(6):1264-73. CR - 14. Axley P, Ahmed Z, Ravi S, Singal AK. Hepatitis C virus and hepatocellular carcinoma: a narrative review. J Clin Transl Hepatol 2018;6(1):79. CR - 15. Hu B, Ma X, Fu P, Sun Q, Tang W, Sun H, et al. miRNA-mRNA regulatory network and factors associated with prediction of prognosis in hepatocellular carcinoma. Genomics Proteomics Bioinformatics 2021;S1672-0229(21)00059-0. [Epub ahead of print] CR - 16. Ho DW, Lo RC, Chan LK, Ng IO. Molecular pathogenesis of hepatocellular carcinoma. Liver Cancer 2016;5(4):290-302. CR - 17. Polikar R. Ensemble learning. Ensemble machine learning. Springer; 2012: pp. 1-34. CR - 18. Akman M, Genç Y, Ankarali H. Random forests yöntemi ve saglık alanında bir uygulama/Random forests methods and an application in health science. Turkiye Klinikleri J Biostat 2011;3(1):36-48. CR - 19. Pinero F, Dirchwolf M, Pessoa MG. Biomarkers in hepatocellular carcinoma: diagnosis, prognosis and treatment response assessment. 
Cells 2020;9(6):1370. CR - 20. Plissonnier M-L, Herzog K, Levrero M, Zeisel MB. Non-coding RNAs and hepatitis C virus-induced hepatocellular carcinoma. Viruses 2018;10(11):591. CR - 21. Hashem S, ElHefnawi M, Habashy S, El-Adawy M, Esmat G, Elakel W, et al. Machine learning prediction models for diagnosing hepatocellular carcinoma with HCV-related chronic liver disease. Comput Methods Programs Biomed 2020;196:105551. CR - 22. Ye Q-H, Qin L-X, Forgues M, He P, Kim JW, Peng AC, et al. Predicting hepatitis B virus-positive metastatic hepatocellular carcinomas using gene expression profiling and supervised machine learning. Nat Med 2003;9(4):416-23. CR - 23. Ueda T, Honda M, Horimoto K, Aburatani S, Saito S, Yamashita T, et al. Gene expression profiling of hepatitis B-and hepatitis C-related hepatocellular carcinoma using graphical Gaussian modeling. Genomics 2013;101(4):238-48. CR - 24. Chang HY, Thomson JA, Chen X. Microarray analysis of stem cells and differentiation. Methods Enzymol 2006;420:225-54. CR - 25. Saeys Y, Inza I, Larranaga P. A review of feature selection techniques in bioinformatics. Bioinformatics 2007;23(19):2507-17. CR - 26. Fodor IK. A Survey of Dimension Reduction Techniques. Lawrence Livermore National Lab (CA). Department of Energy (US). 2002 May. Report No: UCRL-ID-148494 TRN: US200408%%150. CR - 27. Fonti V. Research Paper in Business Analytics: Feature Selection with LASSO. Amsterdam: VU Amsterdam 2017. CR - 28. Wang J, Li P, Ran R, Che Y, Zhou Y. A short-term photovoltaic power prediction model based on the gradient boost decision tree. Appl Sci 2018;8(5):689. CR - 29. Salam Patrous Z. 
Evaluating XGBoost for User Classification by using Behavioral Features Extracted from Smartphone Sensors (dissertation). Stockholm: KTH Royal Institute of Technology. 2018. CR - 30. Dikker J. Boosted tree learning for balanced item recommendation in online retail (dissertation). Eindhoven: The Eindhoven University of Technology. 2017. CR - 31. Smyth GK. Limma: linear models for microarray data. Bioinformatics and computational biology solutions using R and Bioconductor. In: Gail M, Samet JM. Statistics for Biology and Health. Springer; 2005: pp. 397-420. CR - 32. Yan H, Zheng G, Qu J, Liu Y, Huang X, Zhang E, et al. Identification of key candidate genes and pathways in multiple myeloma by integrated bioinformatics analysis. J Cell Physiol 2019;234(12):23785-97. CR - 33. Cevallos M, Egger M, Moher D. STROBE (Strengthening the Reporting of Observational Studies in Epidemiology). In: Moher D, Altman DG, Schulz KF, Simera I, Wager E, editors. Guidelines for Reporting Health Research: A User’s Manual. John Wiley & Sons, Ltd; 2014. p. 169-79. CR - 34. Yang JD, Hainaut P, Gores GJ, Amadou A, Plymoth A, Roberts LR. A global view of hepatocellular carcinoma: trends, risk, prevention and management. Nat Rev Gastroenterol Hepatol 2019;16(10):589-604. CR - 35. Tang A, Hallouch O, Chernyak V, Kamaya A, Sirlin CB. Epidemiology of hepatocellular carcinoma: target population for surveillance and diagnosis. Abdom Radiol (NY) 2018;43(1):13-25. CR - 36. Park JW, Chen M, Colombo M, Roberts LR, Schwartz M, Chen PJ, et al. Global patterns of hepatocellular carcinoma management from diagnosis to death: the BRIDGE Study. Liver Int 2015;35(9):2155-66. CR - 37. Yang JD, Gyedu A, Afihene MY, Duduyemi BM, Micah E, Kingham PT, et al. 
Hepatocellular carcinoma occurs at an earlier age in Africans, particularly in association with chronic hepatitis B. Am J Gastroenterol 2015;110(11):1629-31. CR - 38. Jefferies M, Rauff B, Rashid H, Lam T, Rafiq S. Update on global epidemiology of viral hepatitis and preventive strategies. World J Clin Cases 2018;6(13):589-99. CR - 39. Hill AM, Nath S, Simmons B. The road to elimination of hepatitis C: analysis of cures versus new infections in 91 countries. J Virus Erad 2017;3(3):117-23. CR - 40. Marshall AD, Pawlotsky J-M, Lazarus JV, Aghemo A, Dore GJ, Grebely J. The removal of DAA restrictions in Europe-one step closer to eliminating HCV as a major public health threat. J Hepatol 2018;69(5):1188-96. CR - 41. Page K, Melia MT, Veenhuis RT, Winter M, Rousseau KE, Massaccesi G, et al. Randomized trial of a vaccine regimen to prevent chronic HCV infection. N Engl J Med 2021;384(6):541-9. CR - 42. Ghidini M, Braconi C. Non-coding RNAs in primary liver cancer. Front Med (Lausanne) 2015;2:36. CR - 43. Xie Q, Fan F, Wei W, Liu Y, Xu Z, Zhai L, et al. Multi-omics analyses reveal metabolic alterations regulated by hepatitis B virus core protein in hepatocellular carcinoma cells. Sci Rep 2017;7(1):41089. CR - 44. Li H, Zhu W, Zhang L, Lei H, Wu X, Guo L, et al. The metabolic responses to hepatitis B virus infection shed new light on pathogenesis and targets for treatment. Sci Rep 2015;5(1):8421. CR - 45. Gao Q, Zhu H, Dong L, Shi W, Chen R, Song Z, et al. Integrated proteogenomic characterization of HBV-related hepatocellular carcinoma. Cell 2019;179(2):561-77. CR - 46. Wei X, Su R, Yang M, Pan B, Lu J, Lin H, et al. Quantitative proteomic profiling of hepatocellular carcinoma at different serum alpha-fetoprotein level. 
Transl Oncol 2022;20:101422. CR - 47. Lu M, Kong X, Wang H, Huang G, Ye C, He Z. A novel microRNAs expression signature for hepatocellular carcinoma diagnosis and prognosis. Oncotarget 2017;8(5):8775-84. CR - 48. El Khoury W, Nasr Z. Deregulation of ribosomal proteins in human cancers. Biosci Rep 2021;41(12):BSR20211577. CR - 49. Li F, Deng Y, Zhang S, Zhu B, Wang J, Wang J, et al. Human hepatocyte-enriched miRNA-192-3p promotes HBV replication through inhibiting Akt/mTOR signalling by targeting ZNF143 in hepatic cell lines. Emerg Microbes Infect 2022;11(1):616-28. CR - 50. Akbulut S, Garzali IU, Hargura AS, Aloun A, Yilmaz S. Screening, Surveillance, and Management of Hepatocellular Carcinoma During the COVID-19 Pandemic: a Narrative Review. J Gastrointest Cancer 2022:1-12. UR - https://doi.org/10.26650/IUITFD.1130442 L1 - https://dergipark.org.tr/en/download/article-file/2485309 ER -