Year 2020, Volume 49, Issue 2, Pages 869 - 886, 2020-04-02

Visual research on the trustability of classical variable selection methods in Cox regression

Nihal ATA TUTKUN [1] , Yasemin KAYHAN ATILGAN [2]


Multivariate models such as the Cox regression model are frequently used in studies of clinical outcomes and, if developed carefully, are powerful tools for prognostic prediction. Many applications require modelling a large number of variables with a relatively small patient sample. Determining the important variables in a model is critical to understanding the phenomenon under study, because these are the independent variables that contribute most to the outcome. In practice, a small subset of independent variables is usually selected from a large data set without loss of predictive efficiency. Automatic variable selection algorithms are commonly used in scientific studies to obtain interpretable and practically applicable models. However, careless use of these methods may lead to statistical problems. The resulting models may perform poorly because of violated assumptions, omitted important variables, overfitting, multicollinearity, and outliers. To enhance the accuracy of a model, it is essential to explore the data and its main characteristics before making any statistical inference. This study proposes an approach to trustworthy model selection for survival data that pairs classical variable selection methods with a graphical visualization method, namely robust coplot. The approach makes it possible to examine, in a single graph, the discrimination of observations, clusters of variables, and clusters of observations that are strongly characterized by a particular variable. We apply the combined method, as an integral part of statistical modelling, to multiple myeloma survival data to show how coplot results can be used with automatic variable selection algorithms in Cox regression model building.
Cox regression model, graphical visualization, multidimensional scaling, robust coplot, variable selection
Primary Language: en
Subjects: Statistics and Probability
Section: Statistics
Authors

Orcid: 0000-0001-5204-680X
Author: Nihal ATA TUTKUN (Corresponding Author)
Institution: HACETTEPE UNIVERSITY
Country: Turkey


Orcid: 0000-0002-2612-7216
Author: Yasemin KAYHAN ATILGAN
Institution: HACETTEPE UNIVERSITY
Country: Turkey


Dates

Publication Date: April 2, 2020

Bibtex @article { hujms630402, journal = {Hacettepe Journal of Mathematics and Statistics}, issn = {2651-477X}, eissn = {2651-477X}, address = {}, publisher = {Hacettepe Üniversitesi}, year = {2020}, volume = {49}, pages = {869 - 886}, doi = {10.15672/hujms.630402}, title = {Visual research on the trustability of classical variable selection methods in Cox regression}, key = {cite}, author = {ATA TUTKUN, Nihal and KAYHAN ATILGAN, Yasemin} }
APA ATA TUTKUN, N., KAYHAN ATILGAN, Y. (2020). Visual research on the trustability of classical variable selection methods in Cox regression. Hacettepe Journal of Mathematics and Statistics, 49 (2), 869-886. DOI: 10.15672/hujms.630402
MLA ATA TUTKUN, N., KAYHAN ATILGAN, Y. "Visual research on the trustability of classical variable selection methods in Cox regression". Hacettepe Journal of Mathematics and Statistics 49 (2020): 869-886 <https://dergipark.org.tr/tr/pub/hujms/issue/53568/630402>
Chicago ATA TUTKUN, N., KAYHAN ATILGAN, Y. "Visual research on the trustability of classical variable selection methods in Cox regression". Hacettepe Journal of Mathematics and Statistics 49 (2020): 869-886
RIS TY - JOUR T1 - Visual research on the trustability of classical variable selection methods in Cox regression AU - Nihal ATA TUTKUN , Yasemin KAYHAN ATILGAN Y1 - 2020 PY - 2020 N1 - doi: 10.15672/hujms.630402 DO - 10.15672/hujms.630402 T2 - Hacettepe Journal of Mathematics and Statistics JF - Journal JO - JOR SP - 869 EP - 886 VL - 49 IS - 2 SN - 2651-477X-2651-477X M3 - doi: 10.15672/hujms.630402 UR - https://doi.org/10.15672/hujms.630402 Y2 - 2020 ER -
EndNote %0 Hacettepe Journal of Mathematics and Statistics Visual research on the trustability of classical variable selection methods in Cox regression %A Nihal ATA TUTKUN , Yasemin KAYHAN ATILGAN %T Visual research on the trustability of classical variable selection methods in Cox regression %D 2020 %J Hacettepe Journal of Mathematics and Statistics %P 2651-477X-2651-477X %V 49 %N 2 %R doi: 10.15672/hujms.630402 %U 10.15672/hujms.630402
ISNAD ATA TUTKUN, Nihal, KAYHAN ATILGAN, Yasemin. "Visual research on the trustability of classical variable selection methods in Cox regression". Hacettepe Journal of Mathematics and Statistics 49 / 2 (April 2020): 869-886. https://doi.org/10.15672/hujms.630402
AMA ATA TUTKUN N, KAYHAN ATILGAN Y. Visual research on the trustability of classical variable selection methods in Cox regression. Hacettepe Journal of Mathematics and Statistics. 2020; 49(2): 869-886.
Vancouver ATA TUTKUN N, KAYHAN ATILGAN Y. Visual research on the trustability of classical variable selection methods in Cox regression. Hacettepe Journal of Mathematics and Statistics. 2020; 49(2): 869-886.