Research Article

A New Hybrid Regression Model for Undersized Sample Problem

Volume: 13 Number: 3 September 30, 2017

Abstract

Traditional statistics assumes that the number of samples available for study exceeds the number of well-selected variables. Nowadays, in many fields, the number of samples is in the tens or hundreds while a single observation may have thousands or even millions of dimensions. Classical statistical techniques are not designed to cope with such data sets. Many multivariate statistical techniques, such as principal component analysis, factor analysis, classification, cluster analysis, and the estimation of regression coefficients, require an estimate of the sample variance-covariance matrix or its inverse. When the number of observations is much smaller than the number of features (or variables), the usual sample covariance matrix becomes singular and cannot be inverted. This is one of the biggest obstacles facing classical statistical methods. To remedy the singular covariance matrices that arise in high-dimensional data, Hybrid Covariance Estimators (HCE) were developed by Pamukçu et al. (2015). HCE overcomes the singularity problem of the covariance matrix and thus makes multivariate statistical analysis possible for high-dimensional data sets. One of the most important steps in statistical analysis using HCE is selecting the appropriate covariance structure for the data set, since HCE can be obtained with many different covariance structures. The structure can be selected using information criteria such as the Akaike Information Criterion (AIC) and the Information Complexity (ICOMP) criterion, which are well known as model selection criteria. In this study, we introduce a new regression model with HCE and information criteria for undersized high-dimensional data (n << p). We demonstrate our approach in simulation studies with different scenarios for the p/n ratio. We use the AIC, CAIC, and ICOMP criteria to select the appropriate HCE structure and compare the results with classical regression analysis.
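The singularity problem described in the abstract can be illustrated with a small numerical sketch. The abstract does not specify how HCE is constructed, so the code below does not implement it. As a stand-in, it uses a simple convex shrinkage of the sample covariance toward a scaled identity matrix, in the spirit of the shrinkage estimators in the reference list. The sizes `n` and `p`, the weight `alpha`, and the simulated data are all assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative undersized setting: far fewer observations than variables.
n, p = 20, 100
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 1.0                      # assumed sparse signal, for illustration only
y = X @ beta_true + 0.1 * rng.standard_normal(n)

# Center the data so that covariances can be formed directly.
Xc = X - X.mean(axis=0)
yc = y - y.mean()

# Usual sample covariance of the predictors (p x p).
S = Xc.T @ Xc / (n - 1)
rank_S = np.linalg.matrix_rank(S)        # cannot exceed n - 1, so S is singular when n <= p

# Stand-in regularized estimator (NOT the HCE of the paper):
# convex combination of S and a scaled identity target.
alpha = 0.5                              # assumed shrinkage weight
target = (np.trace(S) / p) * np.eye(p)
S_shrunk = (1 - alpha) * S + alpha * target
min_eig = np.linalg.eigvalsh(S_shrunk).min()

# Regression coefficients from the covariance form: beta = Sigma_xx^{-1} sigma_xy.
s_xy = Xc.T @ yc / (n - 1)
beta_hat = np.linalg.solve(S_shrunk, s_xy)

print("rank of S:", rank_S, "out of p =", p)
print("smallest eigenvalue of shrunk estimator is positive:", min_eig > 0)
```

With a fixed `alpha` this is essentially a ridge-type solution. In the approach the abstract describes, the covariance structure would instead be chosen by comparing information criteria such as AIC, CAIC, and ICOMP across candidate structures.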

References

  1. Donoho, D.L. High dimensional data analysis: The curses and blessings of dimensionality. statweb.stanford.edu/~donoho/Lectures/AMS2000/Curses.pdf. 2000.
  2. Cunningham, P. Dimension Reduction. Technical Report UCD-CSI-2007-7. University College Dublin. 2007.
  3. Fiebig, D.G. On the maximum entropy approach to undersized samples. Applied Mathematics and Computation. 1984; 14, 301-312.
  4. Stein, C. Estimation of covariance matrix. Rietz Lecture. 39th Annual Meeting IMS. Atlanta, Georgia. 1975.
  5. Chen, Y. Robust shrinkage estimation of high dimensional covariance matrices. IEEE Workshop on Sensor Array and Multichannel Signal Processing (SAM). 2010.
  6. Ledoit, O.; Wolf, M. A well conditioned estimator for large dimensional covariance matrices. Journal of Multivariate Analysis. 2004; 88, 365-411.
  7. Pamukçu, E.; Bozdogan, H.; Çalık, S. A Novel Hybrid Dimension Reduction Technique for Undersized High Dimensional Gene Expression Data Sets Using Information Complexity Criterion for Cancer Classification. Computational and Mathematical Methods in Medicine. Volume 2015 (2015), Article ID 370640, 14 pages.
  8. Erbaş, Ü. Entropi İlkelerinin Boyut İndirgeme Uygulamaları [Dimension Reduction Applications of Entropy Principles]. Doctoral thesis. Marmara University Institute of Social Sciences. Istanbul. 2010.

Details

Primary Language

English

Subjects

Engineering

Journal Section

Research Article

Publication Date

September 30, 2017

Submission Date

September 22, 2017

Acceptance Date

May 29, 2017

Published in Issue

Year 2017 Volume: 13 Number: 3

APA
Pamukçu, E. (2017). A New Hybrid Regression Model for Undersized Sample Problem. Celal Bayar University Journal of Science, 13(3), 803-813. https://doi.org/10.18466/cbayarfbe.339536