Artificial Intelligence-based Colon Cancer Prediction by Identifying Genomic Biomarkers
Abstract
Material and Methods: The study used a dataset containing the expression levels of 2000 genes in 62 samples (22 healthy and 40 tumor tissues), obtained by the Princeton University Gene Expression Project and shared in the figshare database. Data were summarized as mean ± standard deviation, and the independent samples t-test was used for statistical comparison. To address the class imbalance in the dataset, the SMOTE method was applied before feature selection. The LASSO feature selection method was then used to select the 13 genes most likely to be associated with colon cancer. Random Forest (RF), Decision Tree (DT), and Gaussian Naive Bayes methods were used in the modeling phase.
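The oversampling step can be illustrated with a minimal sketch of the SMOTE idea, assuming the usual formulation: each synthetic sample is placed at a random point on the segment between a minority sample and one of its k nearest minority neighbours. The function name `smote`, the neighbour count `k=5`, and the three-gene toy data are assumptions made for illustration; they are not taken from the study, which does not report its implementation.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of base (Euclidean), excluding base itself
        neighbours = sorted(
            (s for s in minority if s is not base),
            key=lambda s: math.dist(base, s),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy stand-in for the minority class: 22 "healthy" samples, 3 genes each
# (hypothetical values, not the real expression data).
data_rng = random.Random(42)
healthy = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(22)]

# Generate enough synthetic samples to match the 40 tumor samples.
extra = smote(healthy, n_new=40 - 22)
balanced_healthy = healthy + extra
print(len(balanced_healthy))  # 40
```

Because every synthetic point is an interpolation between two existing minority samples, the new values stay within the observed range of each gene.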
Results: All 13 genes selected by LASSO differed significantly between normal and tumor samples. For the RF model, accuracy, specificity, F1-score, sensitivity, and negative and positive predictive values were all calculated as 1. RF performed best compared with DT and Gaussian Naive Bayes.
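For reference, the six reported metrics all derive from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the example counts are hypothetical and are not the study's actual confusion matrix.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts."""

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is zero.
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)  # recall / true positive rate
    specificity = ratio(tn, tn + fp)  # true negative rate
    ppv = ratio(tp, tp + fp)          # positive predictive value (precision)
    npv = ratio(tn, tn + fn)          # negative predictive value
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * ppv * sensitivity, ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical test set with no misclassifications: every metric equals 1.0.
perfect = binary_metrics(tp=8, fp=0, tn=5, fn=0)
# Hypothetical test set with one false positive and one false negative.
imperfect = binary_metrics(tp=7, fp=1, tn=4, fn=1)
print(perfect)
print(imperfect)
```

All six values equal 1 only when both off-diagonal cells (false positives and false negatives) are zero.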
Conclusion: The study identified candidate genomic biomarkers of colon cancer and classified the disease with a high-performance model. Based on these results, the LASSO+RF approach can be recommended for modeling high-dimensional microarray data.
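A LASSO+RF workflow of the kind recommended here might be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the penalty `alpha=0.05`, the 70/30 hold-out split, and the forest size are arbitrary choices, and the abstract does not state which software or validation scheme was used. The SMOTE step is omitted to keep the sketch short.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a microarray matrix: 62 samples x 200 "genes"
# (assumption: the real dataset has 2000 genes; 200 keeps the sketch fast).
y = np.array([0] * 22 + [1] * 40)
X = rng.normal(size=(62, 200))
X[y == 1, :5] += 2.0  # make the first five genes informative

# Assumed hold-out split; selection and training use the training part only.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# LASSO step: keep genes whose fitted coefficient is non-zero.
selector = SelectFromModel(Lasso(alpha=0.05)).fit(X_train, y_train)
selected = selector.get_support()

# RF step: train and evaluate on the selected genes only.
rf = RandomForestClassifier(n_estimators=200, random_state=0)
rf.fit(selector.transform(X_train), y_train)
accuracy = rf.score(selector.transform(X_test), y_test)
print(int(selected.sum()), accuracy)
```

Fitting the selector on the training split alone keeps the held-out samples out of the gene selection, so the reported accuracy is not influenced by the test data.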
Keywords
Details
Primary Language
English
Subjects
Health Care Administration
Section
Research Article
Publication Date
May 1, 2022
Submission Date
February 22, 2022
Acceptance Date
March 28, 2022
Published in Issue
Year 2022 Volume: 4 Issue: 2
Cited By
Exploring obesity, physical activity, and digital game addiction levels among adolescents: A study on machine learning-based prediction of digital game addiction
Frontiers in Psychology
https://doi.org/10.3389/fpsyg.2023.1097145
Comparison of Feature Selection Methods in Breast Cancer Microarray Data
Medical Records
https://doi.org/10.37990/medr.1202671
Development of Artificial Intelligence Based Clinical Decision Support System on Medical Images for the Classification of COVID-19
Medical Records
https://doi.org/10.37990/medr.1130194
Advances in Genomic Data and Biomarkers: Revolutionizing NSCLC Diagnosis and Treatment
Cancers
https://doi.org/10.3390/cancers15133474
Comparison of Electrocardiographic Parameters by Gender in Heart Failure Patients with Preserved Ejection Fraction via Artificial Intelligence
Diagnostics
https://doi.org/10.3390/diagnostics13203221
Performance comparison machine learning algorithms in diabetes disease prediction
European Mechanical Science
https://doi.org/10.26701/ems.1335503
A Fecal-Microbial-Extracellular-Vesicles-Based Metabolomics Machine Learning Framework and Biomarker Discovery for Predicting Colorectal Cancer Patients
Metabolites
https://doi.org/10.3390/metabo13050589
Microarray Gene Expression Data Classification via Wilcoxon Sign Rank Sum and Novel Grey Wolf Optimized Ensemble Learning Models
IEEE/ACM Transactions on Computational Biology and Bioinformatics
https://doi.org/10.1109/TCBB.2023.3305429
Machine learning approach for classification of prostate cancer based on clinical biomarkers
The Journal of Cognitive Systems
https://doi.org/10.52876/jcs.1221425
Genomic Biomarkers of Metastasis in Breast Cancer Patients: A Machine Learning Approach
The Journal of Cognitive Systems
https://doi.org/10.52876/jcs.1211185
Gene Expression-Based Cancer Classification for Handling the Class Imbalance Problem and Curse of Dimensionality
International Journal of Molecular Sciences
https://doi.org/10.3390/ijms25042102