The Comparison of Machine Learning Algorithms for Microbiome Data

Özlem Akay; Gülfer Yakici

doi:10.56061/fbujohs.1636654

Research Article

Year 2025, Volume: 5, Issue: 2, pp. 206-224, 29.08.2025

Abstract

Next-generation sequencing (NGS) technologies have enabled the identification of both culturable and non-culturable microorganisms in blood samples, revealing their potential roles in systemic infections and immune responses. However, the complexity and high dimensionality of microbiome data pose significant challenges for analysis. This study evaluated the performance of several machine learning (ML) algorithms, including logistic regression, random forest (RF), decision tree, and support vector machines (SVM), in classifying 16S rRNA gene sequencing data of blood microbiota into cultured and non-cultured groups. The dataset, obtained from Kalfin and Panaiotov, consists of 16S rRNA gene sequencing data comprising 18,093 operational taxonomic units (OTUs) and 62 observations, including control samples. After the six control samples were excluded, 56 samples from targeted sequencing of cultured and non-cultured blood from healthy individuals were analyzed. The RF algorithm achieved the highest classification performance and successfully distinguished cultured from non-cultured blood microbiota. The study thus assessed the potential of ML techniques in microbiome research, examining their effectiveness and accuracy in the analysis of microbiome data.
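The abstract does not say which metrics define "classification performance". As a minimal sketch, assuming balanced accuracy and Cohen's kappa (both appear among the cited resources; plain accuracy is added for comparison), the following Python snippet scores a set of predictions for the binary cultured versus non-cultured task. The function names, labels, and the eight example predictions are hypothetical illustrations, not data or code from the study.

```python
# Hypothetical sketch: scoring a binary classifier on cultured vs. non-cultured labels.
# The sample labels below are invented for illustration only.

def confusion_counts(y_true, y_pred, positive):
    """Count true/false positives and negatives for one positive class."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive and guess == positive:
            tp += 1
        elif truth != positive and guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def balanced_accuracy(tp, fp, fn, tn):
    """Mean of sensitivity and specificity (assumes both classes are present)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def cohens_kappa(tp, fp, fn, tn):
    """Agreement beyond chance (undefined when chance agreement equals 1)."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of truth and prediction.
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected)


# Eight hypothetical held-out samples: four per class, one error in each class.
y_true = ["cultured"] * 4 + ["non-cultured"] * 4
y_pred = ["cultured", "cultured", "cultured", "non-cultured",
          "non-cultured", "non-cultured", "non-cultured", "cultured"]

counts = confusion_counts(y_true, y_pred, positive="cultured")
print("accuracy:", accuracy(*counts))
print("balanced accuracy:", balanced_accuracy(*counts))
print("Cohen's kappa:", cohens_kappa(*counts))
```

With only 56 samples, any held-out set is small, so a chance-corrected metric such as kappa may be more informative than raw accuracy when models are compared.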

Keywords

Machine Learning, Microbiome, Blood Microbiota, Metagenomics

References

Abdulqader, Q. M. (2017). Applying the Binary Logistic Regression Analysis on The Medical Data. Science Journal of University of Zakho, 5, 330-334. https://doi.org/10.25271/2017.5.4.388
Aggarwal, N., Kitano, S., Puah, G. R. Y., Kittelmann, S., Hwang, I. Y., & Chang, M. W. (2023). Microbiome and Human Health: Current Understanding, Engineering, and Enabling Technologies. Chem Rev, 123(1), 31-72. https://doi.org/10.1021/acs.chemrev.2c00431
Alam, M. S., & Vuong, S. T. (2013). Random forest classification for detecting android malware. 2013 IEEE international conference on green computing and communications and IEEE Internet of Things and IEEE cyber, physical and social computing, Beijing, China, 2013, pp. 663-669. https://doi.org/10.1109/GreenCom-iThings-CPSCom.2013.122
Alon, T. (2023). Ultimate Guide to PR-AUC: Calculations, uses, and limitations. Retrieved 05.06.2024 from https://www.aporia.com/learn/ultimate-guide-to-precision-recall-auc-understanding-calculating-using-pr-auc-in-ml/
AUC, C. (2024). Retrieved 05.06.2024 from https://cran.r-project.org/web/packages/mikropml/vignettes/introduction.html
Bangdiwala, S. I. (2018). Regression: binary logistic. Int J Inj Contr Saf Promot, 25(3), 336-338. https://doi.org/10.1080/17457300.2018.1486503
Beck, D., & Foster, J. A. (2015). Machine learning classifiers provide insight into the relationship between microbial communities and bacterial vaginosis. BioData mining, 8, 1-9. https://doi.org/10.1186/s13040-015-0055-3
Begg, R. K., Palaniswami, M., & Owen, B. (2005). Support vector machines for automated gait classification. IEEE transactions on Biomedical Engineering, 52(5), 828-838. https://doi.org/10.1109/TBME.2005.845241
Bhavsar, H., & Panchal, M. H. (2012). A review on support vector machine for data classification. International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), 1(10), 185-189.
Bilgin, B., & Hanci, H. (2023). Gut Microbiota and Its Importance for Our Health. Pharmata, 3(3), 71-73. https://doi.org/10.5152/Pharmata.2023.1318972
Breiman, L. (2001). Random forests. Machine Learning, 45, 5-32. https://doi.org/10.1023/A:1010933404324
Buckley, S. J., & Harvey, R. J. (2021). Lessons Learnt From Using the Machine Learning Random Forest Algorithm to Predict Virulence in Streptococcus pyogenes [Mini Review]. Frontiers in Cellular and Infection Microbiology, 11. https://doi.org/10.3389/fcimb.2021.809560
Cheng, H. S., Tan, S. P., Wong, D. M. K., Koo, W. L. Y., Wong, S. H., & Tan, N. S. (2023). The Blood Microbiome and Health: Current Evidence, Controversies, and Challenges. Int J Mol Sci, 24(6). https://doi.org/10.3390/ijms24065633
Chidambaram, S., & Srinivasagan, K. (2019). Performance evaluation of support vector machine classification approaches in data mining. Cluster Computing, 22, 189-196. https://doi.org/10.1007/s10586-018-2036-z
Ciftciler, R., & Ciftciler, A. E. (2022). The importance of microbiota in hematology. Transfus Apher Sci, 61(2), 103320. https://doi.org/10.1016/j.transci.2021.103320
Costa, S. P., & Carvalho, C. M. (2022). Burden of bacterial bloodstream infections and recent advances for diagnosis. Pathog Dis, 80(1). https://doi.org/10.1093/femspd/ftac027
D'Elia, D., Truu, J., Lahti, L., Berland, M., Papoutsoglou, G., Ceci, M., Zomer, A., Lopes, M. B., Ibrahimi, E., Gruca, A., Nechyporenko, A., Frohme, M., Klammsteiner, T., Pau, E. C. S., Marcos-Zambrano, L. J., Hron, K., Pio, G., Simeon, A., Suharoschi, R., . . . Claesson, M. J. (2023). Advancing microbiome research with machine learning: key findings from the ML4Microbiome COST action. Front Microbiol, 14, 1257002. https://doi.org/10.3389/fmicb.2023.1257002
Dembla, G. (2020). Intuition behind Log-loss score. Towards Data Science.
Demirci, M., Saribas, A. S., Siadat, S. D., & Kocazeybek, B. S. (2023). Editorial: Blood microbiota in health and disease. Front Cell Infect Microbiol, 13, 1187247. https://doi.org/10.3389/fcimb.2023.1187247
Emery, D. C., Cerajewska, T. L., Seong, J., Davies, M., Paterson, A., Allen-Birt, S. J., & West, N. X. (2020). Comparison of Blood Bacterial Communities in Periodontal Health and Periodontal Disease. Front Cell Infect Microbiol, 10, 577485. https://doi.org/10.3389/fcimb.2020.577485
Freitas, P., Silva, F., Sousa, J. V., Ferreira, R. M., Figueiredo, C., Pereira, T., & Oliveira, H. P. (2023). Machine learning-based approaches for cancer prediction using microbiome data. Scientific reports, 13(1), 11821. https://doi.org/10.1038/s41598-023-38670-0
Gholamy, A., Kreinovich, V., & Kosheleva, O. (2018). Why 70/30 or 80/20 relation between training and testing sets: A pedagogical explanation. Int. J. Intell. Technol. Appl. Stat, 11(2), 105-111. https://doi.org/10.6148/IJITAS.201806_11(2).0003
Github-1. (2024). Retrieved 06.12.2024 from https://github.com/yhodzhev/blood_microbiota
Goraya, M. U., Li, R., Mannan, A., Gu, L., Deng, H., & Wang, G. (2022). Human circulating bacteria and dysbiosis in non-infectious diseases. Front Cell Infect Microbiol, 12, 932702. https://doi.org/10.3389/fcimb.2022.932702
Gotschlich, E. C., Colbert, R. A., & Gill, T. (2019). Methods in microbiome research: Past, present, and future. Best Pract Res Clin Rheumatol, 33(6), 101498. https://doi.org/10.1016/j.berh.2020.101498
Jijo, B., & Abdulazeez, A. (2021). Classification Based on Decision Tree Algorithm for Machine Learning. Journal of Applied Science and Technology Trends, 2, 20-28. https://doi.org/10.38094/jastt20165
Kajihara, M., Koido, S., Kanai, T., Ito, Z., Matsumoto, Y., Takakura, K., Saruta, M., Kato, K., Odamaki, T., Xiao, J. Z., Sato, N., & Ohkusa, T. (2019). Characterisation of blood microbiota in patients with liver cirrhosis. Eur J Gastroenterol Hepatol, 31(12), 1577-1583. https://doi.org/10.1097/MEG.0000000000001494
Khan, I., Khan, I., Jianye, Z., Xiaohua, Z., Khan, M., Hilal, M. G., Kakakhel, M. A., Mehmood, A., Lizhe, A., & Zhiqiang, L. (2022). Exploring blood microbial communities and their influence on human cardiovascular disease. J Clin Lab Anal, 36(4), e24354. https://doi.org/10.1002/jcla.24354
Khan, I., Khan, I., Usman, M., Xiao Wei, Z., Ping, X., Khan, S., Khan, F., Jianye, Z., Zhiqiang, L., & Lizhe, A. (2022). Circulating microbiota and metabolites: Insights into cardiovascular diseases. J Clin Lab Anal, 36(12), e24779. https://doi.org/10.1002/jcla.24779
Kim, H., Na, J. E., Kim, S., Kim, T. O., Park, S. K., Lee, C. W., Kim, K. O., Seo, G. S., Kim, M. S., Cha, J. M., Koo, J. S., & Park, D. I. (2023). A Machine Learning-Based Diagnostic Model for Crohn's Disease and Ulcerative Colitis Utilizing Fecal Microbiome Analysis. Microorganisms, 12(1). https://doi.org/10.3390/microorganisms12010036
Kolena, T. w. (2024). Retrieved 06.12.2024 from https://docs.kolena.com/metrics/cohens-kappa/
Kouchaki, S., Yang, Y., Lachapelle, A., Walker, T. M., Walker, A. S., , C. C., Peto, T. E. A., Crook, D. W., & Clifton, D. A. (2020). Multi-Label Random Forest Model for Tuberculosis Drug Resistance Classification and Mutation Ranking [Original Research]. Frontiers in Microbiology, 11. https://doi.org/10.3389/fmicb.2020.00667
Huilgol, P. (2025). Retrieved 18.04.2025 from https://www.analyticsvidhya.com/articles/precision-and-recall-in-machine-learning/
Lee, E. J., Sung, J., Kim, H. L., & Kim, H. N. (2022). Whole-Genome Sequencing Reveals Age-Specific Changes in the Human Blood Microbiota. J Pers Med, 12(6). https://doi.org/10.3390/jpm12060939
Li, L., Mendis, N., Trigui, H., Oliver, J. D., & Faucher, S. P. (2014). The importance of the viable but non-culturable state in human bacterial pathogens. Front Microbiol, 5, 258. https://doi.org/10.3389/fmicb.2014.00258
Lin, Y., Lee, Y., & Wahba, G. (2002). Support vector machines for classification in nonstandard situations. Machine Learning, 46, 191-202. https://doi.org/10.1023/A:1012406528296
Liu, Y. X., Qin, Y., Chen, T., Lu, M., Qian, X., Guo, X., & Bai, Y. (2021). A practical guide to amplicon and metagenomic analysis of microbiome data. Protein Cell, 12(5), 315-330. https://doi.org/10.1007/s13238-020-00724-8
Log, G. (2024). Retrieved 05.06.2024 from https://rdrr.io/github/jeffreyevans/rfUtilities/man/logLoss.html
Loganathan, T., & Priya Doss, C. G. (2022). The influence of machine learning technologies in gut microbiome research and cancer studies- A review. Life Sci, 311(Pt A), 121118. https://doi.org/10.1016/j.lfs.2022.121118
Mair, R. D., & Sirich, T. L. (2019). Blood Microbiome in CKD: Should We Care? Clin J Am Soc Nephrol, 14(5), 648-649. https://doi.org/10.2215/CJN.03420319
Molina-Menor, E., Gimeno-Valero, H., Pascual, J., Pereto, J., & Porcar, M. (2020). High Culturable Bacterial Diversity From a European Desert: The Tabernas Desert. Front Microbiol, 11, 583120. https://doi.org/10.3389/fmicb.2020.583120
Nannapaneni, P., Hertwig, F., Depke, M., Hecker, M., Mäder, U., Völker, U., Steil, L., & van Hijum, S. (2012). Defining the structure of the general stress regulon of Bacillus subtilis using targeted microarray analysis and random forest classification. Microbiology (Reading), 158(Pt 3), 696-707. https://doi.org/10.1099/mic.0.055434-0
Olugbenga, M. (2024). Retrieved 06.12.2024 from https://neptune.ai/blog/balanced-accuracy
Othman, M. F. B., Abdullah, N. B., & Kamal, N. F. B. (2011). MRI brain classification using support vector machine. 2011 fourth international conference on modeling, simulation and applied optimization, Kuala Lumpur, Malaysia, 2011, pp. 1-4. https://doi.org/10.1109/ICMSAO.2011.5775605
Pal, M. (2005). Random forest classifier for remote sensing classification. International Journal of Remote Sensing, 26(1), 217-222. https://doi.org/10.1080/01431160412331269698
Panaiotov, S., Hodzhev, Y., Tsafarova, B., Tolchkov, V., & Kalfin, R. (2021). Culturable and Non-Culturable Blood Microbiota of Healthy Individuals. Microorganisms, 9(7). https://doi.org/10.3390/microorganisms9071464
Porras, A. M., & Brito, I. L. (2019). The internationalization of human microbiome research. Curr Opin Microbiol, 50, 50-55. https://doi.org/10.1016/j.mib.2019.09.012
Potgieter, M., Bester, J., Kell, D. B., & Pretorius, E. (2015). The dormant blood microbiome in chronic, inflammatory diseases. FEMS Microbiology Reviews, 39(4), 567-591. https://doi.org/10.1093/femsre/fuv013
Rai, K., Devi, M. S., & Guleria, A. (2016). Decision tree based algorithm for intrusion detection. International Journal of Advanced Networking and Applications, 7(4), 2828.
Rana, S., Midi, H. B., & Sarkar, S. K. (2010). Validation and Performance Analysis of Binary Logistic Regression Model.
Requena, T., & Velasco, M. (2021). The human microbiome in sickness and in health. Rev Clin Esp (Barc), 221(4), 233-240. https://doi.org/10.1016/j.rceng.2019.07.018
Saboo, K., Petrakov, N. V., Shamsaddini, A., Fagan, A., Gavis, E. A., Sikaroodi, M., McGeorge, S., Gillevet, P. M., Iyer, R. K., & Bajaj, J. S. (2022). Stool microbiota are superior to saliva in distinguishing cirrhosis and hepatic encephalopathy using machine learning. J Hepatol, 76(3), 600-607. https://doi.org/10.1016/j.jhep.2021.11.011
Sathiyanarayanan, P., Pavithra, S., M. Sai, S., & Makeswari, M. (2019, 29-30 March). Identification of Breast Cancer Using The Decision Tree Algorithm. 2019 IEEE International Conference on System, Computation, Automation and Networking (ICSCAN). https://doi.org/10.1109/ICSCAN.2019.8878757
Sharma, H., & Kumar, S. (2016). A survey on decision tree algorithms of classification in data mining. International Journal of Science and Research (IJSR), 5(4), 2094-2097.
Srivastava, T. (2024). Retrieved 06.12.2024 from https://www.analyticsvidhya.com/blog/2019/08/11-important-model-evaluation-error-metrics/
Tallón-Ballesteros, A. J., Fong, S., & Leal-Díaz, R. (2019). Does the order of attributes play an important role in classification? Hybrid Artificial Intelligent Systems: 14th International Conference, HAIS 2019, León, Spain, September 4–6, 2019, Proceedings 14. https://doi.org/10.1007/978-3-030-29859-3_32
Teixeira, M., Silva, F., Ferreira, R. M., Pereira, T., Figueiredo, C., & Oliveira, H. P. (2024). A review of machine learning methods for cancer characterization from microbiome data. NPJ Precision Oncology, 8(1), 123. https://doi.org/10.1038/s41698-024-00617-7
Timsit, J. F., Ruppe, E., Barbier, F., Tabah, A., & Bassetti, M. (2020). Bloodstream infections in critically ill patients: an expert statement. Intensive Care Med, 46(2), 266-284. https://doi.org/10.1007/s00134-020-05950-6
Topçuoğlu, B. D., Lapp, Z., Sovacool, K. L., Snitkin, E., Wiens, J., & Schloss, P. D. (2021). mikropml: User-Friendly R Package for Supervised Machine Learning Pipelines. J Open Source Softw, 6(61). https://doi.org/10.21105/joss.03073
Topçuoğlu, B. D., Lesniak, N. A., Ruffin IV, M. T., Wiens, J., & Schloss, P. D. (2020). A framework for effective application of machine learning to microbiome-based classification problems. MBio, 11(3). https://doi.org/10.1128/mbio.00434-20
Tsafarova, B., Hodzhev, Y., Yordanov, G., Tolchkov, V., Kalfin, R., & Panaiotov, S. (2022). Morphology of blood microbiota in healthy individuals assessed by light and electron microscopy. Front Cell Infect Microbiol, 12, 1091341. https://doi.org/10.3389/fcimb.2022.1091341
Wang, X.-W., & Liu, Y.-Y. (2020). Comparative study of classifiers for human microbiome data. Medicine in microecology, 4, 100013. https://doi.org/10.1016/j.medmic.2020.100013
Wilhelm, R. C., van Es, H. M., & Buckley, D. H. (2022). Predicting measures of soil health using the microbiome and supervised machine learning. Soil Biology and Biochemistry, 164, 108472. https://doi.org/10.1016/j.soilbio.2021.108472
Wu, Q., & Zhou, D.-X. (2006). Analysis of support vector machine classification. Journal of Computational Analysis & Applications, 8(2).
Zheng, D., Liwinski, T., & Elinav, E. (2020). Interaction between microbiota and immunity in health and disease. Cell Res, 30(6), 492-506. https://doi.org/10.1038/s41422-020-0332-7

Turkish title: Mikrobiyom Verileri için Makine Öğrenme Algoritmalarının Karşılaştırılması

There are 65 references in total.

Details

Primary Language	English
Subjects	Clinical Medical Sciences (Other)
Section	Research Articles
Authors	Özlem Akay (ORCID 0000-0002-9539-7252), Gülfer Yakici (ORCID 0000-0001-6486-3209)
Publication Date	29 August 2025
Submission Date	10 February 2025
Acceptance Date	7 March 2025
Published in Issue	Year 2025, Volume: 5, Issue: 2

Cite

APA	Akay, Ö., & Yakici, G. (2025). The Comparison of Machine Learning Algorithms for Microbiome Data. Fenerbahçe University Journal of Health Sciences, 5(2), 206-224. https://doi.org/10.56061/fbujohs.1636654

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.