Post high-dimensional shrinkage estimation for sparse generalized linear models

Seyedeh Zahra Aghamohammadi

doi:10.15672/hujms.1634222

Research Article

Year 2025, Volume: 54 Issue: 4, 1533 - 1562, 29.08.2025

Abstract

In this study, we propose a new selection and estimation procedure for the regression coefficients of high-dimensional generalized linear models in which many coefficients have weak effects (weak signals). Many existing procedures for selecting regression coefficients in high-dimensional generalized linear models, such as the Least Absolute Shrinkage and Selection Operator (LASSO), Elastic-Net, Smoothly Clipped Absolute Deviation (SCAD), and Minimax Concave Penalty (MCP), focus mainly on selecting variables with strong effects. This may result in biased parameter estimates, particularly when the weak signals greatly outnumber the strong ones. We therefore propose an algorithm that first performs variable selection and then applies an efficient post-selection estimation step based on weighted ridge estimators combined with Stein-type shrinkage strategies. We compute the biases and mean squared errors of the proposed estimators and prove the oracle properties of the selection procedure. We compare the performance of the new procedure with existing penalized regression methods using Monte Carlo simulations. Finally, we illustrate the methodology with a genome-wide association analysis of a cancer data set.
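As a rough illustration of the select-then-shrink idea described above, the following sketch combines a restricted fit on the strong signals with a weighted ridge fit that also includes the weak signals. It is not the paper's estimator: it assumes a Gaussian linear model rather than a general GLM, takes the strong and weak index sets as given (in practice they would come from a penalized-likelihood selection step), and uses a generic positive-part Stein-type weight. The function name `post_selection_shrinkage`, the `ridge` parameter, and the form of the distance statistic are all illustrative assumptions.

```python
import numpy as np

def post_selection_shrinkage(X, y, strong, weak, ridge=1.0):
    """Positive-part Stein-type combination of a restricted and a weighted ridge fit.

    Illustrative Gaussian linear-model version only; all names are hypothetical.
    Assumes X[:, strong] has full column rank and len(weak) >= 3.
    """
    X1, X2 = X[:, strong], X[:, weak]
    n, p1, p2 = X.shape[0], len(strong), len(weak)

    # Restricted estimator: least squares on the strong set alone.
    beta_re = np.linalg.lstsq(X1, y, rcond=None)[0]

    # Weighted ridge: the penalty acts only on the weak coefficients.
    Xs = np.hstack([X1, X2])
    penalty = np.diag(np.r_[np.zeros(p1), np.full(p2, ridge)])
    beta_full = np.linalg.solve(Xs.T @ Xs + penalty, Xs.T @ y)
    beta_wr1, beta_wr2 = beta_full[:p1], beta_full[p1:]

    # Distance statistic: size of the weak-signal fit after projecting out X1,
    # scaled by a residual variance estimate.
    X2_resid = X2 - X1 @ np.linalg.lstsq(X1, X2, rcond=None)[0]
    sigma2 = np.sum((y - Xs @ beta_full) ** 2) / max(n - p1 - p2, 1)
    t_stat = beta_wr2 @ (X2_resid.T @ X2_resid) @ beta_wr2 / sigma2

    # Positive-part Stein-type weight, which lies in [0, 1) when p2 >= 3.
    weight = max(0.0, 1.0 - (p2 - 2) / t_stat) if t_stat > 0 else 0.0

    # Shrink the weighted ridge estimate of the strong coefficients
    # toward the restricted estimate.
    beta_ps = beta_re + weight * (beta_wr1 - beta_re)
    return beta_re, beta_wr1, beta_ps, weight

# Simulated example: 3 strong signals and 20 weak signals among 50 predictors.
rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:3] = [3.0, -2.0, 1.5]
beta[3:23] = 0.1
y = X @ beta + rng.standard_normal(n)

strong = np.arange(3)      # assumed output of a selection step
weak = np.arange(3, 23)    # assumed weak-signal candidates
beta_re, beta_wr1, beta_ps, weight = post_selection_shrinkage(X, y, strong, weak)
```

When the distance statistic is small, the weak signals contribute little and the weight falls to zero, so the result reduces to the restricted estimator; when it is large, the result moves toward the weighted ridge estimate.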

Keywords

Generalized linear models, high-dimensional data, penalized likelihood, weak signals, post shrinkage estimation

Details

Primary Language	English
Subjects	Computational Statistics, Approximation Theory and Asymptotic Methods
Journal Section	Statistics
Authors	Seyedeh Zahra Aghamohammadi (ORCID: 0009-0008-6122-1321)
Early Pub Date	July 14, 2025
Publication Date	August 29, 2025
Submission Date	February 6, 2025
Acceptance Date	June 29, 2025
Published in Issue	Year 2025 Volume: 54 Issue: 4

Cite

APA	Aghamohammadi, S. Z. (2025). Post high-dimensional shrinkage estimation for sparse generalized linear models. Hacettepe Journal of Mathematics and Statistics, 54(4), 1533-1562. https://doi.org/10.15672/hujms.1634222
AMA	Aghamohammadi SZ. Post high-dimensional shrinkage estimation for sparse generalized linear models. Hacettepe Journal of Mathematics and Statistics. August 2025;54(4):1533-1562. doi:10.15672/hujms.1634222
Chicago	Aghamohammadi, Seyedeh Zahra. “Post High-Dimensional Shrinkage Estimation for Sparse Generalized Linear Models”. Hacettepe Journal of Mathematics and Statistics 54, no. 4 (August 2025): 1533-62. https://doi.org/10.15672/hujms.1634222.
EndNote	Aghamohammadi SZ (August 1, 2025) Post high-dimensional shrinkage estimation for sparse generalized linear models. Hacettepe Journal of Mathematics and Statistics 54 4 1533–1562.
IEEE	S. Z. Aghamohammadi, “Post high-dimensional shrinkage estimation for sparse generalized linear models”, Hacettepe Journal of Mathematics and Statistics, vol. 54, no. 4, pp. 1533–1562, 2025, doi: 10.15672/hujms.1634222.
ISNAD	Aghamohammadi, Seyedeh Zahra. “Post High-Dimensional Shrinkage Estimation for Sparse Generalized Linear Models”. Hacettepe Journal of Mathematics and Statistics 54/4 (August 2025), 1533-1562. https://doi.org/10.15672/hujms.1634222.
JAMA	Aghamohammadi SZ. Post high-dimensional shrinkage estimation for sparse generalized linear models. Hacettepe Journal of Mathematics and Statistics. 2025;54:1533–1562.
MLA	Aghamohammadi, Seyedeh Zahra. “Post High-Dimensional Shrinkage Estimation for Sparse Generalized Linear Models”. Hacettepe Journal of Mathematics and Statistics, vol. 54, no. 4, 2025, pp. 1533-62, doi:10.15672/hujms.1634222.
Vancouver	Aghamohammadi SZ. Post high-dimensional shrinkage estimation for sparse generalized linear models. Hacettepe Journal of Mathematics and Statistics. 2025;54(4):1533-62.

For more information about the journal, please visit: https://dergipark.org.tr/en/pub/hujms