TY - JOUR TT - The Success Of Logistic Regression With Feature Reduction Techniques On Microarray Gene Classification AU - Yengi, Yeliz AU - İlhan Omurca, Sevinç PY - 2016 DA - June JF - Türkiye Bilişim Vakfı Bilgisayar Bilimleri ve Mühendisliği Dergisi JO - TBV-BBMD PB - Akademik Bilişim Vakfı WT - DergiPark SN - 1305-8991 SP - 1 EP - 12 VL - 8 IS - 1 KW - Gene analysis KW - machine learning KW - logistic regression KW - feature reduction KW - SVM N2 - DNA microarray classification is important for discovering genes that are differentially expressed between normal and diseased patients, a central research problem in bioinformatics. Not all of the genes in an expression profile are informative, and many of them are redundant. A pre-processing step that reduces the number of genes by feature selection while retaining the best class prediction accuracy for the classifier is therefore crucial for precise tumor classification. In this study, the class prediction accuracy of two classifiers, LR (Logistic Regression) and SVM (Support Vector Machines), was compared using the best genes selected by wrapper and filter techniques that use heuristic search methods. We conclude that LR combined with heuristic-search-based feature selection is as efficient as SVM for microarray gene prediction. CR - Ben-Dor, A., Shamir, R., Yakhini, Z., 1999, Clustering gene expression patterns, J Comput Biol, 6(3): 281–97. CR - Roberts, C.J., Nelson, B., Marton, M.J., Stoughton, R., Meyer, M.R., Bennett, H.A., 2000, Signaling and circuitry of multiple MAPK pathways revealed by a matrix of global gene expression profiles, Science, 287: 873–80. CR - Ben-Dor, A., Bruhn, L., Friedman, N., Nachman, I., Schummer, M., Yakhini, Z., 2000, Tissue classification with gene expression profiles, In: Proceedings of the Fourth International Conference on Computational Molecular Biology. Tokyo: Universal Academic Press.
CR - Alizadeh, A., Eisen, M.B., Davis, R.E., Ma, C., Lossos, I.S., Rosenwald, A., 2000, Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling, Nature, 403: 503–11. CR - Wang, X., Gotoh, O., 2010, A robust gene selection method for microarray-based cancer classification, Cancer Inf, 9: 15–30. CR - Ruiz, R., Riquelme, J.C., Aguilar-Ruiz, J.S., 2005, Incremental wrapper-based gene selection from microarray data for cancer classification, Pattern Recognition, 39: 2383–2392. CR - Langley, P., 1994, Selection of relevant features in machine learning, In: Proceedings of the AAAI Fall Symposium on Relevance. CR - Kohavi, R., John, G., 1997, Wrappers for feature subset selection, Artif. Intell. 1–2: 273–324. CR - Alter, O., Brown, P.O., Botstein, D., 2000, Singular value decomposition for genome-wide expression data processing and modeling, Proc. Natl. Acad. Sci., 97(18). CR - Cangelosi, R., Goriely, A., 2007, Component retention in principal component analysis with application to cDNA microarray data, Biol. Direct, 2: 1–21. CR - Liu, K., Li, B., Wu, Q.Q., Zhang, J., Du, J.X., Liu, G.Y., 2009, Microarray data classification based on ensemble independent component selection, Comput. Biol. Med., 39(11): 953–960. CR - Inza, I., Larranaga, P., Blanco, R., Cerrolaza, A., 2004, Filter versus wrapper gene selection approaches in DNA microarray domains, Artif. Intell. Med., 31(2): 91–103. CR - Pohar, M., Blas, M., Turk, S., 2004, Comparison of Logistic Regression and Linear Discriminant Analysis: A Simulation Study, Metodološki zvezki, 1: 143–161. CR - Maroco, J., Silva, D., Rodrigues, A., Guerreiro, M., Santana, I., Mendonça, A., 2011, Data mining methods in the prediction of Dementia: A real-data comparison of the accuracy, sensitivity and specificity of linear discriminant analysis, logistic regression, neural networks, support vector machines, classification trees and random forests, BMC Research Notes, 4: 299.
CR - Hall, M.A., Smith, L.A., 1997, Feature subset selection: A Correlation Based Filter Approach, In: International Conference on Neural Information Processing and Intelligent Information Systems. Berlin: Springer, 855–858. CR - Jackson, J., 1991, A user's guide to principal components, Wiley & Sons, New York. CR - Loh, W., 2006, Logistic regression tree analysis, In: Springer Handbook of Engineering Statistics, 537–551. CR - Breiman, L., Friedman, H., Olshen, J., Stone, C., 1984, Classification and Regression Trees, Belmont, CA: Wadsworth. CR - Le Cessie, S., Van Houwelingen, J.C., 1992, Ridge Estimators in Logistic Regression, University of Leiden, the Netherlands. Appl. Statist., 41(1): 191–201. CR - Liu, D., Ghosh, D., Lin, X., 2008, Estimation and testing for the effect of a genetic pathway on a disease outcome using logistic kernel machine regression via logistic mixed models, BMC Bioinformatics. CR - Bartenhagen, C., Klein, H.U., Ruckert, C., Jiang, X., Dugas, M., 2010, Comparative study of unsupervised dimension reduction techniques for the visualization of microarray gene expression data, BMC Bioinformatics, 11(567). CR - Kim, K.J., Cho, S.B., 2006, Ensemble classifiers based on correlation analysis for DNA microarray classification, Neurocomputing, 70: 187–199. CR - Nguyen, D.V., Rocke, D.M., 2002, Tumor classification by partial least squares using microarray gene expression data, Bioinformatics, 18: 39–50. CR - Cortes, C., Vapnik, V., 1995, Support-Vector Networks, Machine Learning, 20: 273–297. CR - Smith, L.I., 2002, A tutorial on Principal Components Analysis. CR - Dagliyan, O., Uney-Yuksektepe, F., Kavakli, I.H., Turkay, M., 2011, Optimization Based Tumor Classification from Microarray Gene Expression Data. CR - Vimaladevi, M., Kalaavathi, B., 2014, Cancer Classification using Hybrid Fast Particle Swarm Optimization with Backpropagation Neural Network, International Journal of Advanced Research in Computer and Communication Engineering, 3(11).
CR - Pauly, F., Smedby, K.E., Jerkeman, M., Hjalgrim, H., Ohlsson, M., Rosenquist, R., Borrebaeck, C.A.K., Wingren, C., 2014, Identification of B-cell lymphoma subsets by plasma protein profiling using recombinant antibody microarrays, Leukemia Research, 38: 682–690. CR - Yan, Z., Li, J., Xiong, Y., Xu, W., Zheng, G., 2012, Identification of candidate colon cancer biomarkers by applying a random forest approach on microarray data, Oncology Reports, 28: 1036–1042. CR - Thorsteinsson, M., Kirkeby, L.T., Hansen, R., Lund, L.R., Sørensen, L.T., Gerds, T.A., Jess, P., Olsen, J., 2012, Gene expression profiles in stages II and III colon cancers: application of a 128-gene signature, Int J Colorectal Dis, 27: 1579–1586. CR - Bennet, J., Ganaprakasam, C.A., Arputharaj, K., 2014, A Discrete Wavelet Based Feature Extraction and Hybrid Classification Technique for Microarray Data Analysis, Hindawi Publishing Corporation, The Scientific World Journal. CR - www.biomedcentral.com/1471-2105/12/390 /#B12 CR - www.cs.waikato.ac.nz/ml/weka/ UR - https://dergipark.org.tr/tr/pub/tbbmd/article/238832 L1 - https://dergipark.org.tr/tr/download/article-file/398991 ER -