Robust variable selection via the weighted elastic-net multi-step screening procedure
Year 2025,
Volume: 54 Issue: 6, 2350 - 2362, 30.12.2025
Yunlu Jiang, Huijie Lu, Xiaowen Huang, Ruizhe Jiang
Abstract
Variable selection in high-dimensional data remains a critical yet challenging task, particularly in the presence of highly correlated features and outliers. In this paper, we propose a novel robust variable selection method based on a weighted elastic-net multi-step screening procedure. The proposed method is not only robust to heavy-tailed errors and high-leverage points, but can also handle highly correlated covariates and high-dimensional data sets with $p>n$, where $p$ is the number of predictors and $n$ is the sample size. In addition, a multi-step iterative algorithm is introduced to compute the proposed estimator. Finally, extensive numerical simulations and a real-world NASDAQ index-tracking application illustrate the merits of the proposed method. The results indicate that the proposed method achieves better finite-sample performance than several existing methods when highly correlated covariates and outliers are present in the high-dimensional linear regression model.
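The abstract combines two ingredients: an elastic-net penalty for correlated, high-dimensional covariates, and observation weighting for robustness to outliers. As a rough illustration only (not the authors' procedure; the weighting rule, penalty levels, and the function `weighted_elastic_net` below are all assumptions for the sketch), one can iterate between an elastic-net fit and Huber-type weights that down-weight observations with large residuals:

```python
# Hedged sketch, NOT the paper's algorithm: an iteratively reweighted
# elastic net. Huber-type observation weights, computed from the residuals
# of the current fit, down-weight outliers before each refit.
import numpy as np
from sklearn.linear_model import ElasticNet

def weighted_elastic_net(X, y, alpha=0.1, l1_ratio=0.9, n_steps=3, c=1.345):
    n, _ = X.shape
    w = np.ones(n)                                  # start with uniform weights
    model = None
    for _ in range(n_steps):
        model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10000)
        model.fit(X, y, sample_weight=w)
        r = y - model.predict(X)                    # residuals of current fit
        s = np.median(np.abs(r)) / 0.6745 + 1e-12   # robust MAD scale estimate
        u = np.abs(r) / s
        w = np.where(u <= c, 1.0, c / u)            # Huber weights: shrink large residuals
    return model

# Toy p > n data with a sparse signal and a few gross outliers.
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, 2.0, 1.5]                          # only 3 active predictors
y = X @ beta + 0.5 * rng.standard_normal(n)
y[:5] += 20.0                                       # inject vertical outliers
fit = weighted_elastic_net(X, y)
```

The reweighting step plays the role that the paper's screening steps play at a much coarser level: each pass reduces the influence of contaminated observations on the penalized fit.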
Ethical Statement
The authors declare no conflict of interest related to this study.
Project Number
This work was supported by the National Natural Science Foundation of China (12171203) and the Fundamental Research Funds for the Central Universities (23JNQMX21).
Thanks
The authors would like to express their heartfelt gratitude to the reviewers for their valuable feedback and insightful suggestions, which have greatly enhanced the quality of this work.
References
- [1] A. Alfons, C. Croux and S. Gelper, Sparse least trimmed squares regression for analyzing high-dimensional large data sets, Ann. Appl. Stat., 226-248, 2013.
- [2] O. Arslan, Weighted LAD-LASSO method for robust parameter estimation and variable selection in regression, Comput. Stat. Data Anal. 56 (6), 1952-1965, 2012.
- [3] K. Boudt, P.J. Rousseeuw, S. Vanduffel and T. Verdonck, The minimum regularized covariance determinant estimator, Stat. Comput. 30 (1), 113-128, 2020.
- [4] J. Bradic, J. Fan and W. Wang, Penalized composite quasi-likelihood for ultrahigh dimensional variable selection, J. R. Stat. Soc. Ser. B Stat. Methodol. 73 (3), 325-349, 2011.
- [5] P. Bühlmann, M. Kalisch and M.H. Maathuis, Variable selection in high-dimensional linear models: partially faithful distributions and the PC-simple algorithm, Biometrika 97 (2), 261-278, 2010.
- [6] X. Chen, Z.J. Wang and M.J. McKeown, Asymptotic analysis of robust LASSOs in the presence of noise with large variance, IEEE Trans. Inf. Theory 56 (10), 5131-5149, 2010.
- [7] H. Cho and P. Fryzlewicz, High dimensional variable selection via tilting, J. R. Stat. Soc. Ser. B Stat. Methodol. 74 (3), 593-622, 2012.
- [8] J. Fan, Y. Fan and E. Barut, Adaptive robust variable selection, Ann. Stat. 42 (1), 324, 2014.
- [9] J. Fan and R. Li, Nonconcave penalized likelihood with NP-dimensionality, IEEE Trans. Inf. Theory 57 (8), 5467-5484, 2011.
- [10] J. Fan and R. Li, Variable selection via nonconcave penalized likelihood and its oracle properties, J. Am. Stat. Assoc. 96 (456), 1348-1360, 2001.
- [11] Y. Jiang, Y. Wang, J. Jiang, B. Xie, J. Liao and W. Liao, Outlier detection and robust variable selection via the penalized weighted LAD-LASSO method, J. Appl. Stat. 48 (2), 234-246, 2021.
- [12] Y. Jiang, Y.G. Wang, L. Fu and X. Wang, Robust estimation using modified Huber's functions with new tails, Technometrics 61 (1), 111-122, 2019.
- [13] B.A. Johnson and L. Peng, Rank-based variable selection, J. Nonparametr. Stat. 20 (3), 241-252, 2008.
- [14] R.J. Karunamuni, L. Kong and W. Tu, Efficient robust doubly adaptive regularized regression with applications, Stat. Methods Med. Res. 28 (7), 2210-2226, 2019.
- [15] C. Leng, Variable selection and coefficient estimation via regularized rank regression, Stat. Sin., 167-181, 2010.
- [16] N. Li, Efficient sparse portfolios based on composite quantile regression for high-dimensional index tracking, J. Stat. Comput. Simul. 90 (8), 1466-1478, 2020.
- [17] G. Li, H. Peng and L. Zhu, Nonconcave penalized M-estimation with a diverging number of parameters, Stat. Sin., 391-419, 2011.
- [18] P. Rousseeuw and K. Van Driessen, A fast algorithm for the minimum covariance determinant estimator, Technometrics 42 (3), 212-223, 1999.
- [19] V. Ročková and E.I. George, The spike-and-slab lasso, J. Am. Stat. Assoc. 113 (512), 431-444, 2018.
- [20] R. Tibshirani, Regression shrinkage and selection via the lasso, J. R. Stat. Soc. Ser. B Stat. Methodol. 58 (1), 267-288, 1996.
- [21] X. Wang, Y. Jiang, M. Huang and H. Zhang, Robust variable selection with exponential squared loss, J. Am. Stat. Assoc. 108 (502), 632-643, 2013.
- [22] H. Wang, G. Li and G. Jiang, Robust regression shrinkage and consistent variable selection through the LAD-Lasso, J. Bus. Econ. Stat. 25 (3), 347-355, 2007.
- [23] L. Wang and R. Li, Weighted Wilcoxon-type smoothly clipped absolute deviation method, Biometrics 65 (2), 564-571, 2009.
- [24] Y. Wu and Y. Liu, Variable selection in quantile regression, Stat. Sin., 801-817, 2009.
- [25] C. Wen, X. Wang and S. Wang, Laplace error penalty-based variable selection in high dimension, Scand. J. Stat. 42 (3), 685-700, 2015.
- [26] Y. Xue, J. Ren and B. Yang, ENMSP: an elastic-net multi-step screening procedure for high-dimensional regression, Stat. Comput. 34 (2), 79, 2024.
- [27] M. Yuan and Y. Lin, Model selection and estimation in regression with grouped variables, J. R. Stat. Soc. Ser. B Stat. Methodol. 68 (1), 49-67, 2006.
- [28] Z. Yang, L. Fu, Y.G. Wang, Z. Dong and Y. Jiang, A robust and efficient variable selection method for linear regression, J. Appl. Stat. 49 (14), 3677-3692, 2022.
- [29] C.H. Zhang, Nearly unbiased variable selection under minimax concave penalty, Ann. Stat. 38 (2), 894-942, 2010.
- [30] T. Zhang, Analysis of multi-stage convex relaxation for sparse regularization, J. Mach. Learn. Res. 11 (3), 2010.
- [31] J. Zhu, X. Wang, L. Hu, J. Huang, K. Jiang, Y. Zhang, S. Lin and J. Zhu, abess: A fast best-subset selection library in Python and R, J. Mach. Learn. Res. 23 (202), 1-7, 2022.
- [32] J. Zhu, C. Wen, J. Zhu, H. Zhang and X. Wang, A polynomial algorithm for best-subset selection problem, Proc. Natl. Acad. Sci. 117 (51), 33117-33123, 2020.
- [33] H. Zou, The adaptive lasso and its oracle properties, J. Am. Stat. Assoc. 101 (476), 1418-1429, 2006.
- [34] H. Zou and T. Hastie, Regularization and variable selection via the elastic net, J. R. Stat. Soc. Ser. B Stat. Methodol. 67 (2), 301-320, 2005.
- [35] H. Zou and M. Yuan, Composite quantile regression and the oracle model selection theory, Ann. Stat. 36 (3), 1108-1126, 2008.
- [36] H. Zou and H.H. Zhang, On the adaptive elastic-net with a diverging number of parameters, Ann. Stat. 37 (4), 1733, 2009.