Research Article

Robust regression estimation and variable selection when cellwise and casewise outliers are present

Volume: 50 Number: 1 February 4, 2021
Abstract

Two main issues in regression analysis are parameter estimation and variable selection in the presence of outliers. Popular robust regression estimators have been combined with variable selection methods to achieve robust estimation and variable selection simultaneously. However, recent work has shown that the robust estimators used in these procedures are resistant only to casewise (rowwise) outliers. Because such robust variable selection methods may not cope with cellwise outliers, extra care is needed when cellwise outliers are present along with casewise outliers. In this study, we propose a robust estimation and variable selection method that deals with both cellwise and casewise outliers. The proposed method has three steps. In the first step, cellwise outliers are identified in each explanatory variable, deleted, and marked as NA. In the second step, the cells marked NA are imputed using a robust imputation method. In the last step, robust regression estimation methods are combined with the variable selection method LASSO (Least Absolute Shrinkage and Selection Operator) to estimate the regression parameters and to select the relevant explanatory variables. The simulation results and a real data example show that the proposed estimation and variable selection procedure performs well in the presence of cellwise and casewise outliers.

Keywords

Details

Primary Language

English

Subjects

Statistics

Journal Section

Research Article

Publication Date

February 4, 2021

Submission Date

May 8, 2020

Acceptance Date

November 23, 2020

Published in Issue

Year 2021 Volume: 50 Number: 1

APA
Toka, O., Çetin, M., & Arslan, O. (2021). Robust regression estimation and variable selection when cellwise and casewise outliers are present. Hacettepe Journal of Mathematics and Statistics, 50(1), 289-303. https://doi.org/10.15672/hujms.734212
AMA
1. Toka O, Çetin M, Arslan O. Robust regression estimation and variable selection when cellwise and casewise outliers are present. Hacettepe Journal of Mathematics and Statistics. 2021;50(1):289-303. doi:10.15672/hujms.734212
Chicago
Toka, Onur, Meral Çetin, and Olcay Arslan. 2021. “Robust Regression Estimation and Variable Selection When Cellwise and Casewise Outliers Are Present”. Hacettepe Journal of Mathematics and Statistics 50 (1): 289-303. https://doi.org/10.15672/hujms.734212.
EndNote
Toka O, Çetin M, Arslan O (February 1, 2021) Robust regression estimation and variable selection when cellwise and casewise outliers are present. Hacettepe Journal of Mathematics and Statistics 50 1 289–303.
IEEE
[1] O. Toka, M. Çetin, and O. Arslan, “Robust regression estimation and variable selection when cellwise and casewise outliers are present”, Hacettepe Journal of Mathematics and Statistics, vol. 50, no. 1, pp. 289–303, Feb. 2021, doi: 10.15672/hujms.734212.
ISNAD
Toka, Onur - Çetin, Meral - Arslan, Olcay. “Robust Regression Estimation and Variable Selection When Cellwise and Casewise Outliers Are Present”. Hacettepe Journal of Mathematics and Statistics 50/1 (February 1, 2021): 289-303. https://doi.org/10.15672/hujms.734212.
JAMA
1. Toka O, Çetin M, Arslan O. Robust regression estimation and variable selection when cellwise and casewise outliers are present. Hacettepe Journal of Mathematics and Statistics. 2021;50:289–303.
MLA
Toka, Onur, et al. “Robust Regression Estimation and Variable Selection When Cellwise and Casewise Outliers Are Present”. Hacettepe Journal of Mathematics and Statistics, vol. 50, no. 1, Feb. 2021, pp. 289-303, doi:10.15672/hujms.734212.
Vancouver
1. Onur Toka, Meral Çetin, Olcay Arslan. Robust regression estimation and variable selection when cellwise and casewise outliers are present. Hacettepe Journal of Mathematics and Statistics. 2021 Feb. 1;50(1):289-303. doi:10.15672/hujms.734212
