DNA Microarray Gene Expression Data Classification Using SVM, MLP, and RF with Feature Selection Methods Relief and LASSO
Abstract
DNA microarray technology is a novel method for monitoring the expression levels of a large number of genes simultaneously. These gene expression profiles are used to detect various forms of disease. Using multiple microarray datasets, this paper cross-compares two methods each for classification and feature selection. Because microarray data contain a very large number of genes, only the most informative genes should be selected and used. For this selection, we evaluated the Relief and LASSO feature selection methods. After the informative genes are selected, classification is performed with Support Vector Machines (SVM) and Multilayer Perceptron (MLP) networks, both of which are widely used in classification tasks. The overall accuracy of LASSO combined with SVM exceeds that of most previously proposed approaches.
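The gene selection step described above can be illustrated with a minimal sketch of Relief-style feature weighting. This is not the authors' implementation: the function names, the toy data, and the choice to visit every sample deterministically (the original Relief algorithm draws a random subset of samples) are assumptions made for illustration. Features are assumed to be numeric and scaled to [0, 1], with two classes.

```python
# Minimal Relief-style sketch (binary classes, features scaled to [0, 1]).
# Function names and toy data are illustrative assumptions, not from the paper.

def relief_weights(X, y):
    """Return one relevance weight per feature from nearest hits and misses."""
    n, d = len(X), len(X[0])
    weights = [0.0] * d

    def dist(a, b):
        # Manhattan distance between two samples.
        return sum(abs(u - v) for u, v in zip(a, b))

    for i in range(n):
        # Candidates from the same class (hits) and the other class (misses).
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        near_hit = min(hits, key=lambda j: dist(X[i], X[j]))
        near_miss = min(misses, key=lambda j: dist(X[i], X[j]))
        for f in range(d):
            # Reward features that differ from the nearest miss,
            # penalize features that differ from the nearest hit.
            weights[f] += (abs(X[i][f] - X[near_miss][f])
                           - abs(X[i][f] - X[near_hit][f])) / n
    return weights


def top_k_features(weights, k):
    """Indices of the k highest-weighted features."""
    order = sorted(range(len(weights)), key=lambda f: weights[f], reverse=True)
    return order[:k]


# Toy "expression matrix": 6 samples x 3 genes.
# Gene 0 separates the two classes; genes 1 and 2 behave like noise.
X = [
    [0.10, 0.50, 0.90],
    [0.20, 0.90, 0.10],
    [0.15, 0.10, 0.50],
    [0.90, 0.55, 0.85],
    [0.80, 0.85, 0.15],
    [0.85, 0.15, 0.45],
]
y = [0, 0, 0, 1, 1, 1]

weights = relief_weights(X, y)
selected = top_k_features(weights, k=1)
print(selected)  # [0] for this toy data
```

In a full pipeline, the selected gene indices would be used to reduce the expression matrix before training a classifier such as an SVM or MLP.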
Details
Primary Language
English
Subjects
Engineering
Section
Research Article
Authors
Kıvanç Güçkıran
*
0000-0002-9501-2068
Türkiye
İsmail Cantürk
0000-0003-0690-1873
Türkiye
Lale Özyılmaz
0000-0001-9720-9852
Türkiye
Publication Date
April 1, 2019
Submission Date
August 14, 2018
Acceptance Date
April 2, 2019
Published in Issue
Year 2019 Volume: 23 Issue: 1
Cited By
Hybrid gene selection approach using XGBoost and multi-objective genetic algorithm for cancer classification
Medical & Biological Engineering & Computing
https://doi.org/10.1007/s11517-021-02476-x
Incorporating Feature Selection Methods into Machine Learning-Based Covid-19 Diagnosis
Applied Computer Systems
https://doi.org/10.2478/acss-2022-0002
A Modified Firefly Deep Ensemble for Microarray Data Classification
The Computer Journal
https://doi.org/10.1093/comjnl/bxac143
Comparison of Feature Selection Methods in Breast Cancer Microarray Data
Medical Records
https://doi.org/10.37990/medr.1202671
Artificial Intelligence-based Colon Cancer Prediction by Identifying Genomic Biomarkers
Medical Records
https://doi.org/10.37990/medr.1077024
Hybrid feature selection model based on relief‐based algorithms and regulizer algorithms for cancer classification
Concurrency and Computation: Practice and Experience
https://doi.org/10.1002/cpe.6200
Generalized Penalized Constrained Regression: Sharp Guarantees in High Dimensions with Noisy Features
Mathematics
https://doi.org/10.3390/math11173706
Improved multi-layer hybrid adaptive particle swarm optimization based artificial bee colony for optimizing feature selection and classification of microarray data
Multimedia Tools and Applications
https://doi.org/10.1007/s11042-023-17234-4
A New Approach for Multimodal Usage of Gene Expression and Its Image Representation for the Detection of Alzheimer’s Disease
Biomolecules
https://doi.org/10.3390/biom13111563
Hybrid ANOVA and LASSO Methods for Feature Selection and Linear Support Vector, Multilayer Perceptron and Random Forest Classifiers Based on Spark Environment for Microarray Data Classification
IOP Conference Series: Materials Science and Engineering
https://doi.org/10.1088/1757-899X/1094/1/012107
Prediction of Alzheimer’s Disease by a Novel Image-Based Representation of Gene Expression
Genes
https://doi.org/10.3390/genes13081406
Identification of potential biomarkers with colorectal cancer based on bioinformatics analysis and machine learning
Mathematical Biosciences and Engineering
https://doi.org/10.3934/mbe.2021443
An Efficient Approach to Microarray Data Classification using Elastic Net Feature Selection, SVM and RF
Journal of Physics: Conference Series
https://doi.org/10.1088/1742-6596/1911/1/012010
Memory based cuckoo search algorithm for feature selection of gene expression dataset
Informatics in Medicine Unlocked
https://doi.org/10.1016/j.imu.2021.100572
Gene reduction and machine learning algorithms for cancer classification based on microarray gene expression data: A comprehensive review
Expert Systems with Applications
https://doi.org/10.1016/j.eswa.2022.118946
Transcriptome profiling by combined machine learning and statistical R analysis identifies TMEM236 as a potential novel diagnostic biomarker for colorectal cancer
Scientific Reports
https://doi.org/10.1038/s41598-021-92692-0
AltWOA: Altruistic Whale Optimization Algorithm for feature selection on microarray datasets
Computers in Biology and Medicine
https://doi.org/10.1016/j.compbiomed.2022.105349
Gene selection based on recursive spider wasp optimizer guided by marine predators algorithm
Neural Computing and Applications
https://doi.org/10.1007/s00521-024-09965-8
Wavelet feature extraction and bio-inspired feature selection for the prognosis of lung cancer − A statistical framework analysis
Measurement
https://doi.org/10.1016/j.measurement.2024.115330
Hybrid Causal Feature Selection for Cancer Biomarker Identification From RNA-Seq Data
IEEE/ACM Transactions on Computational Biology and Bioinformatics
https://doi.org/10.1109/TCBB.2024.3406922