Research Article
Year 2021, Volume 50, Issue 5, 1572 - 1582, 15.10.2021
https://doi.org/10.15672/hujms.810383

Robust variable selection in the logistic regression model

Abstract

In this paper, we propose an adaptive robust variable selection procedure for the logistic regression model. The proposed method is robust to outliers and takes the goodness-of-fit of the regression model into account. We apply an MM algorithm to solve the resulting optimization problem. Monte Carlo studies are conducted to evaluate the finite-sample performance of the proposed method. The results show that, when there are outliers in the dataset or the distribution of the covariates deviates from the normal distribution, the proposed method performs better in finite samples than the other existing methods considered. Finally, the proposed methodology is applied to the analysis of Parkinson's disease data.
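
The abstract does not state the objective function, so the following is only a minimal sketch of how an MM (majorize-minimize) iteration can drive variable selection in logistic regression; it is not the authors' procedure. It assumes the ordinary (non-robust) average logistic loss with a plain L1 penalty and no intercept. The loss is majorized by a quadratic whose curvature `L = trace(X'X) / (4n)` bounds the Hessian, and the surrogate is minimized in closed form by soft-thresholding. All names (`mm_l1_logistic`, `lam`, `n_iter`) are hypothetical.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def soft_threshold(z, t):
    # Minimizer over b of 0.5 * (b - z)**2 + t * |b|.
    return math.copysign(max(abs(z) - t, 0.0), z)


def objective(X, y, beta, lam):
    # Average negative log-likelihood plus L1 penalty (assumed objective).
    nll = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * x for b, x in zip(beta, xi))
        # log(1 + exp(eta)) evaluated without overflow.
        nll += max(eta, 0.0) + math.log1p(math.exp(-abs(eta))) - yi * eta
    return nll / len(X) + lam * sum(abs(b) for b in beta)


def mm_l1_logistic(X, y, lam, n_iter=500):
    n, p = len(X), len(X[0])
    # Hessian of the average loss <= X'X / (4n) <= L * I, with L from the trace.
    L = sum(x * x for xi in X for x in xi) / (4.0 * n)
    beta = [0.0] * p
    history = [objective(X, y, beta, lam)]
    for _ in range(n_iter):
        # Gradient of the average logistic loss at the current iterate.
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            r = sigmoid(sum(b * x for b, x in zip(beta, xi))) - yi
            for j in range(p):
                grad[j] += r * xi[j] / n
        # Minimize the quadratic surrogate plus penalty, coordinate by coordinate.
        beta = [soft_threshold(b - g / L, lam / L) for b, g in zip(beta, grad)]
        history.append(objective(X, y, beta, lam))
    return beta, history


# Simulated data (an assumption for illustration): two active covariates out of five.
random.seed(0)
true_beta = [2.0, -1.5, 0.0, 0.0, 0.0]
X = [[random.gauss(0.0, 1.0) for _ in range(5)] for _ in range(200)]
y = [1 if random.random() < sigmoid(sum(b * x for b, x in zip(true_beta, xi))) else 0
     for xi in X]
beta_hat, history = mm_l1_logistic(X, y, lam=0.05)
```

Because each step minimizes a surrogate that lies above the objective and touches it at the current iterate, the values in `history` never increase. A robust variant would replace the logistic loss with a bounded or down-weighted loss and majorize that instead.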

Details

Primary Language English
Subjects Statistics
Journal Section Statistics
Authors

Yunlu Jiang 0000-0001-9047-3079

Jianto Zhang 0000-0003-1270-3494

Yingqiang Huang 0000-0001-5659-8425

Hang Zou 0000-0001-5175-5356

Meilan Huang 0000-0002-1587-4662

Fanhong Chen 0000-0002-6151-0528

Publication Date October 15, 2021
Published in Issue Year 2021

Cite

APA Jiang, Y., Zhang, J., Huang, Y., Zou, H., et al. (2021). Robust variable selection in the logistic regression model. Hacettepe Journal of Mathematics and Statistics, 50(5), 1572-1582. https://doi.org/10.15672/hujms.810383
AMA Jiang Y, Zhang J, Huang Y, Zou H, Huang M, Chen F. Robust variable selection in the logistic regression model. Hacettepe Journal of Mathematics and Statistics. October 2021;50(5):1572-1582. doi:10.15672/hujms.810383
Chicago Jiang, Yunlu, Jianto Zhang, Yingqiang Huang, Hang Zou, Meilan Huang, and Fanhong Chen. “Robust Variable Selection in the Logistic Regression Model”. Hacettepe Journal of Mathematics and Statistics 50, no. 5 (October 2021): 1572-82. https://doi.org/10.15672/hujms.810383.
EndNote Jiang Y, Zhang J, Huang Y, Zou H, Huang M, Chen F (October 1, 2021) Robust variable selection in the logistic regression model. Hacettepe Journal of Mathematics and Statistics 50 5 1572–1582.
IEEE Y. Jiang, J. Zhang, Y. Huang, H. Zou, M. Huang, and F. Chen, “Robust variable selection in the logistic regression model”, Hacettepe Journal of Mathematics and Statistics, vol. 50, no. 5, pp. 1572–1582, 2021, doi: 10.15672/hujms.810383.
ISNAD Jiang, Yunlu et al. “Robust Variable Selection in the Logistic Regression Model”. Hacettepe Journal of Mathematics and Statistics 50/5 (October 2021), 1572-1582. https://doi.org/10.15672/hujms.810383.
JAMA Jiang Y, Zhang J, Huang Y, Zou H, Huang M, Chen F. Robust variable selection in the logistic regression model. Hacettepe Journal of Mathematics and Statistics. 2021;50:1572–1582.
MLA Jiang, Yunlu et al. “Robust Variable Selection in the Logistic Regression Model”. Hacettepe Journal of Mathematics and Statistics, vol. 50, no. 5, 2021, pp. 1572-82, doi:10.15672/hujms.810383.
Vancouver Jiang Y, Zhang J, Huang Y, Zou H, Huang M, Chen F. Robust variable selection in the logistic regression model. Hacettepe Journal of Mathematics and Statistics. 2021;50(5):1572-82.