TY - JOUR
T1 - Makine Öğrenmesi Modelleri ile Kanser Hücre Hattı–İlaç Etkileşim Tahmini
TT - Cancer Cell Line–Drug Interaction Prediction with Machine Learning Models
AU - Özcan, Gıyasettin
AU - Mergen, Berfu
PY - 2025
DA - June
Y2 - 2024
DO - 10.35414/akufemubid.1555642
JF - Afyon Kocatepe Üniversitesi Fen Ve Mühendislik Bilimleri Dergisi
PB - Afyon Kocatepe Üniversitesi
WT - DergiPark
SN - 2149-3367
SP - 695
EP - 703
VL - 25
IS - 3
LA - tr
AB - This study analyzes pharmacogenomic interactions in small cell lung carcinoma. To that end, the data collection, data manipulation, and model development steps needed to predict drug sensitivity from mutation load with machine learning were carried out. A new dataset was derived by merging three separate open-source datasets provided by the Sanger Institute. The first data source contains cell lines and their mutation information. The second contains detailed information about the cell lines. The third contains drug-cell interactions and the drug sensitivity of the cell lines. By counting the different mutation loads in the merged data, drug compounds, cell lines, mutation loads, tissue, and IC50 features were gathered into a single dataset. In the second phase of the study, the derived data were used in machine learning, and the effect of mutation load on drug resistance was predicted. For this purpose, three different machine learning algorithms were tested. RMSE, R2, and MAE were computed and compared to assess model performance. According to the results, the XGBoost model we developed predicted the IC50 score between cell line and drug to a meaningful degree. This provides preliminary information on the resistance of drugs to mutations and on their effect. In addition, machine learning analyses are used to show which mutation types, by count, have a greater effect on drug resistance.
KW - Drug sensitivity prediction
KW - Mutation load
KW - Machine learning
KW - Personalized medicine
KW - XGBoost
N2 - In this study, we address pharmacogenomic interactions in small cell lung carcinoma. For this purpose, data collection, data manipulation, and machine learning algorithms were used. A new dataset was generated by combining three open-source datasets. The first data source contains cell lines and their mutation information. The second data source contains detailed information about the cell lines. The third data source contains drug-cell interactions and the drug sensitivity of the cell lines. From the combined sources, the different mutation loads were counted, so that chemical compounds, cell lines, mutation loads, tissue, and IC50 characteristics were collected in a single dataset. In the second phase of the study, the derived data were used in machine learning to predict the effect of mutation load on drug resistance. Three different machine learning algorithms were tested for this prediction. To analyze model performance, RMSE, R2, and MAE were computed and compared. According to the results, the XGBoost model we developed predicts the IC50 score for cell line-drug pairs to a meaningful degree. It thereby provides preliminary information on the extent of drug resistance and drug effect. In addition, the study uses machine learning analysis to show which mutation types, by count, have a greater effect on drug resistance.
UR - https://doi.org/10.35414/akufemubid.1555642
L1 - https://dergipark.org.tr/tr/download/article-file/4239033
ER -