Development of a New Supervised Principal Component Analysis Based on Artificial Neural Networks in Gene Expression Data
Abstract
The aim of this study is to reduce the dimension of multidimensional gene expression data using supervised principal component analysis (S-PCA) and, as a newly proposed approach, supervised principal component analysis with artificial neural networks (S-ANN-PCA), and to compare the performance of the two methods using random survival forests (RSF). In the simulation study, 5000 genes were generated from a multivariate normal distribution, and survival times correlated with these gene data were then generated for 100 units. The simulation was repeated 1000 times.
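The simulation design described above could be sketched roughly as follows. The abstract does not state the covariance structure, how the survival times were tied to the genes, or whether censoring was applied, so the identity covariance, the exponential proportional-hazards model, the censoring scheme, and the names `simulate_gene_survival`, `n_signal`, and `effect` are all assumptions made for illustration only.

```python
import numpy as np


def simulate_gene_survival(n_units=100, n_genes=5000, n_signal=10,
                           effect=0.5, seed=0):
    """Simulate gene expression and correlated survival times (illustrative)."""
    rng = np.random.default_rng(seed)

    # Assumption: identity covariance, i.e. independent standard-normal genes.
    genes = rng.standard_normal((n_units, n_genes))

    # Assumption: only the first n_signal genes influence survival,
    # through a proportional-hazards linear predictor.
    beta = np.zeros(n_genes)
    beta[:n_signal] = effect
    linear_predictor = genes @ beta

    # Exponential survival times with hazard exp(linear_predictor),
    # so a larger predictor means a shorter expected survival time.
    time = rng.exponential(scale=np.exp(-linear_predictor))

    # Assumption: independent exponential censoring (not stated in the abstract).
    censor = rng.exponential(scale=2.0 * np.median(time), size=n_units)
    observed = np.minimum(time, censor)
    event = (time <= censor).astype(int)
    return genes, observed, event, linear_predictor
```

Under these assumptions, the 1000 repetitions would correspond to calling `simulate_gene_survival(seed=rep)` for `rep` in `range(1000)` and applying both dimension reduction methods to each simulated data set.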
In addition, gene expression data from 240 individuals with diffuse large B-cell lymphoma (DLBCL) were used. Dimension reduction was performed using the Wald statistic to select important genes. The new data sets obtained from the two methods were analyzed using RSF. In the simulation study, the explanatory power of S-PCA differed significantly from that of S-ANN-PCA (p<0.001). In the DLBCL application, RSF gave an error rate of 36.78% for S-PCA and 43% for S-ANN-PCA. S-PCA had a higher importance value and a lower error rate than S-ANN-PCA. S-PCA therefore performed better than S-ANN-PCA in analyzing gene expression data affected by the multidimensionality problem.
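As a rough illustration of the S-PCA step, the sketch below screens genes with a univariate Cox Wald statistic and then runs an ordinary PCA on the retained genes. It is a minimal sketch, not the authors' implementation: the Newton-Raphson fit assumes no tied survival times, and the threshold of 2.0, the step clipping, and the function names `cox_wald_z` and `supervised_pca` are assumptions chosen for illustration. The S-ANN-PCA variant and the RSF comparison are not shown.

```python
import numpy as np


def cox_wald_z(x, time, event, n_iter=25, tol=1e-8):
    """Wald z-statistic of a one-covariate Cox model (assumes no tied times)."""
    order = np.argsort(-time)              # longest time first
    x = x[order] - x.mean()                # centre the covariate
    d = event[order].astype(bool)          # True where the event was observed
    beta, info = 0.0, 0.0
    for _ in range(n_iter):
        w = np.exp(beta * x)
        # Cumulative sums over the sorted data give risk-set totals.
        s0 = np.cumsum(w)
        s1 = np.cumsum(w * x)
        s2 = np.cumsum(w * x * x)
        mean = s1 / s0
        score = np.sum((x - mean)[d])               # partial-likelihood score
        info = np.sum((s2 / s0 - mean ** 2)[d])     # observed information
        if info <= 0:
            return 0.0
        # Clipped Newton-Raphson step (the clipping is a safeguard, an assumption).
        step = float(np.clip(score / info, -1.0, 1.0))
        beta += step
        if abs(step) < tol:
            break
    return beta * np.sqrt(info)            # beta divided by its standard error


def supervised_pca(genes, time, event, threshold=2.0, n_components=1):
    """Keep genes with |Wald z| above threshold, then apply PCA via SVD."""
    z = np.array([cox_wald_z(genes[:, j], time, event)
                  for j in range(genes.shape[1])])
    selected = np.flatnonzero(np.abs(z) > threshold)
    if selected.size == 0:
        raise ValueError("no gene passed the screening threshold")
    sub = genes[:, selected]
    sub = sub - sub.mean(axis=0)           # centre columns before PCA
    u, s, _ = np.linalg.svd(sub, full_matrices=False)
    k = min(n_components, len(s))
    scores = u[:, :k] * s[:k]              # principal component scores
    return scores, selected, z
```

The returned component scores would then serve as the reduced covariates passed to a survival model such as RSF. In practice the screening threshold is usually tuned, for example by cross-validation, rather than fixed in advance.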
Details
Primary Language
English
Subjects
Health Care Administration
Section
Research Article
Authors
Mevlüt Türe
*
ADNAN MENDERES ÜNİVERSİTESİ
Türkiye
İmran Kurt Ömürlü
ADNAN MENDERES ÜNİVERSİTESİ
Türkiye
Publication Date
February 22, 2018
Submission Date
December 27, 2017
Acceptance Date
February 22, 2018
Published in Issue
Year 2018 Volume: 40 Issue: 1