In this study, we propose a new selection and estimation procedure for the regression coefficients of high-dimensional generalized linear models in which many coefficients have weak effects (weak signals). Many existing procedures for selecting regression coefficients in high-dimensional generalized linear models, such as the Least Absolute Shrinkage and Selection Operator, Elastic-Net, Smoothly Clipped Absolute Deviation, and Minimax Concave Penalty, focus mainly on selecting variables with strong effects. This may result in biased parameter estimation, particularly when the number of weak signals is extremely high relative to the number of strong signals. Therefore, we propose an algorithm in which variable selection is performed first, followed by an efficient post-selection estimation step based on weighted ridge estimators combined with Stein-type shrinkage strategies. We compute the biases and mean squared errors of the proposed estimators and prove the oracle properties of the selection procedure. We investigate the performance of the new procedure relative to existing penalized regression methods using Monte Carlo simulations. Finally, we illustrate the methodology by performing a genome-wide association analysis on a cancer data set.
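The post-selection step can be illustrated with a minimal sketch of a generic positive-part Stein-type rule, which pulls a full-model estimate toward a submodel estimate that keeps only the strong signals. The abstract does not give the paper's exact estimators, so everything here is an assumption made for illustration: the function name `positive_part_shrinkage`, its arguments, and the weight `max(0, 1 - (k - 2) / T)`, where `T` is taken to be a test statistic for the weak coefficients and `k` their number.

```python
def positive_part_shrinkage(beta_sub, beta_full, t_stat, k_weak):
    """Combine two estimates of the strong-signal coefficients.

    beta_sub  : estimate from the submodel with strong signals only
    beta_full : estimate of the same coefficients from the larger model
                (for example a weighted ridge fit); hypothetical input
    t_stat    : test statistic for the weak coefficients being zero
    k_weak    : number of weak coefficients (assumed to be at least 3)
    """
    if len(beta_sub) != len(beta_full):
        raise ValueError("both estimates must have the same length")
    if k_weak < 3:
        raise ValueError("Stein-type shrinkage needs at least 3 weak coefficients")

    # Positive-part weight: 0 keeps the submodel estimate,
    # values near 1 move toward the full-model estimate.
    if t_stat <= 0:
        weight = 0.0
    else:
        weight = max(0.0, 1.0 - (k_weak - 2) / t_stat)

    combined = [s + weight * (f - s) for s, f in zip(beta_sub, beta_full)]
    return combined, weight


# Illustrative numbers only, not results from the paper.
estimate, weight = positive_part_shrinkage([2.1, -1.4], [1.9, -1.2], 30.0, 17)
print(weight, estimate)
```

With a small test statistic the weight is truncated at zero and the submodel estimate is returned unchanged; a large statistic, which suggests the weak signals matter jointly, moves the result toward the full-model estimate.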
Generalized linear models, high-dimensional data, penalized likelihood, weak signals, post shrinkage estimation
| Primary Language | English |
|---|---|
| Subjects | Computational Statistics, Approximation Theory and Asymptotic Methods |
| Journal Section | Statistics |
| Authors | |
| Early Pub Date | July 14, 2025 |
| Publication Date | August 29, 2025 |
| Submission Date | February 6, 2025 |
| Acceptance Date | June 29, 2025 |
| Published in Issue | Year 2025 Volume: 54 Issue: 4 |