Research Article

Mikrobiyota Verileri İçin Boyut İndirgemede Yeni Bir Yaklaşım

Year 2021, Volume: 4 Issue: 1, 23 - 30, 15.01.2021

Abstract

Microorganisms associated with the human skin, the nasopharyngeal and oral cavities, the vaginal tract, and the gastrointestinal tract make up the human microbiota. The microbiota strongly influences physiological, metabolic, and immune functions, and its association with many diseases has been demonstrated. Recent advances in DNA sequencing technology have made it easier to profile microbial communities through high-throughput sequencing of amplicons of marker genes such as 16S rRNA for bacteria, 18S rRNA, or ITS. The resulting data consist of frequency values for a very large number of microbiota species and contain an abundance of zeros. Before high-dimensional data such as microbiota data can be analyzed with statistical models, species that make no meaningful contribution to the result must be removed from the analysis at the preprocessing stage. In the statistical literature, this step is called dimension reduction or variable elimination.
This study proposes a new approach to dimension reduction for high-dimensional frequency-type data sets that contain many zeros. To this end, univariate tests, a zero-inflated negative binomial model, classification and regression trees, and a variable selection algorithm were used.
The proposed approach was tested on microbiota genera obtained from Parkinson's disease patients, individuals with early dementia, and control individuals. Variable selection yielded 19 candidate genera out of 199 bacterial genera, and these candidates were found to be genera that many studies also highlight as clinically relevant. To evaluate how well the candidate genera perform in disease diagnosis, a multiple logistic regression model was fitted with a further round of stepwise variable elimination; this model successfully separated the patient and control groups using only a few bacterial genera.
The new hybrid approach proposed in this study makes it possible to carry forward into the data analysis those variables identified by the joint decision of several methods. Similar approaches can be tried with other methods and applied to different data types.


A New Approach to Dimension Reduction for Microbiota Data


Abstract

Microorganisms associated with the human skin, nasopharyngeal and oral cavities, vaginal tract, and gastrointestinal system make up the human microbiota. The microbiota strongly influences physiological, metabolic, and immune functions and has been shown to be associated with many diseases. Recent advances in DNA sequencing technology have facilitated the profiling of these microbial communities through high-throughput sequencing of amplicons of marker genes such as 16S rRNA for bacteria, 18S rRNA, or ITS. Data generated by such sequencing efforts are preprocessed into composition or relative abundance values, which are often presented in species abundance (OTU/ASV) tables. The resulting data consist of frequencies for a very large number of microbiota species and contain a large proportion of zero values. High-dimensional data in such tables must therefore be treated with dimension reduction techniques before sensible conclusions can be drawn. In the statistical literature, this process is called dimension reduction or variable selection.
The aim of this study is to propose a novel approach to dimension reduction for high-dimensional, inherently zero-inflated, frequency-type microbiota data. For this purpose, univariate tests, a zero-inflated negative binomial model, classification and regression trees, and a feature selection and variable screening algorithm were used. Using these four methods together enabled us to select the most important features of the microbiota dataset for downstream analyses.
We tested this approach on a microbiota dataset that we recently generated from stool samples of a cohort of Parkinson’s disease patients. Of 199 bacterial genera, our approach selected 19 candidate biomarker genera, which are often implicated in critical metabolic activities in the human body, such as the production of short-chain fatty acids. To assess the potential of these candidate biomarkers to differentiate disease and healthy states, we developed a multiple logistic regression model and further screened the candidates with stepwise variable selection.
Big data analysis increasingly calls for more sophisticated, combinatorial methods. Here we demonstrate that the previously untested combined use of these feature selection methods yields more useful predictive models. Similar approaches can be tried with different methods and used on different data types.
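The idea of combining several selection methods can be illustrated with a small voting sketch. This is not the authors' implementation: the abstract does not state how the four methods' results were merged, so the `min_votes` threshold, the function name `consensus_features`, and the genus labels below are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def consensus_features(
    selections: Dict[str, Iterable[str]], min_votes: int
) -> List[str]:
    """Return features picked by at least `min_votes` selection methods."""
    if not 1 <= min_votes <= len(selections):
        raise ValueError("min_votes must be between 1 and the number of methods")
    votes: Counter = Counter()
    for chosen in selections.values():
        # Each method votes at most once per feature, so deduplicate first.
        votes.update(set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical outputs of four screening methods (placeholder genus labels,
# not results from the study).
selections = {
    "univariate_tests": ["genus_A", "genus_B", "genus_C"],
    "zinb_model": ["genus_A", "genus_C", "genus_D"],
    "cart": ["genus_A", "genus_B", "genus_D"],
    "screening_algorithm": ["genus_A", "genus_C", "genus_E"],
}

# Assumed rule: keep a genus when at least three of the four methods agree.
candidates = consensus_features(selections, min_votes=3)
print(candidates)
```

Raising `min_votes` to the number of methods demands unanimous agreement and gives a shorter candidate list; lowering it to 1 returns the union of all selections.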


Details

Primary Language Turkish
Journal Section Articles
Authors

Handan Ankaralı 0000-0002-3613-0523

Süleyman Yıldırım 0000-0002-2752-1223

Nurgül Bulut 0000-0002-7247-6302

Publication Date January 15, 2021
Published in Issue Year 2021 Volume: 4 Issue: 1

Cite

APA Ankaralı, H., Yıldırım, S., & Bulut, N. (2021). Mikrobiyota Verileri İçin Boyut İndirgemede Yeni Bir Yaklaşım. Veri Bilimi, 4(1), 23-30.


