Research Article

A comparative study on the performance of frequentist and Bayesian estimation methods under separation in logistic regression

Volume: 69 Number: 2 December 31, 2020

Abstract

Separation is one of the most commonly encountered estimation problems in logistic regression and often occurs with small and medium sample sizes. Under separation, the method of maximum likelihood (MLE; Fisher) yields spuriously large parameter estimates and standard errors. Many researchers in the social sciences rely on simple but ad hoc remedies, such as the "do nothing" strategy, removing variable(s) from the model, or combining the levels of the categorical variable that causes the separation. The limitations of these basic remedies have motivated researchers to use more appropriate and innovative estimation techniques. However, the performance of these techniques has not yet been fully investigated or compared. The main goal of this paper is to close this research gap by comparing the performance of frequentist and Bayesian estimation methods for coping with separation. A simulation study investigates the performance of asymptotic, bootstrap-based, and Bayesian estimation techniques with respect to bias, precision, and accuracy under separation. In line with the simulation study, a real-data example illustrates how to use these methods to handle separation in logistic regression.
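The effect of separation on maximum likelihood can be illustrated with a minimal sketch. The toy data, the one-parameter model without intercept, the function name `fit_logistic`, and the Gaussian-prior (ridge) penalty are all assumptions made for illustration; they are not the data or the specific estimators studied in the paper. With completely separated data the unpenalized slope keeps growing as the optimizer runs longer, whereas the penalized slope settles at a finite value.

```python
import numpy as np

def fit_logistic(x, y, penalty=0.0, steps=2000, lr=0.1):
    """Gradient ascent on the (optionally ridge-penalized) log-likelihood
    of the one-parameter model P(y = 1 | x) = 1 / (1 + exp(-beta * x)).
    `penalty` is a hypothetical ridge weight (a Gaussian prior on beta)."""
    beta = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-beta * x))
        # Score of the log-likelihood minus the gradient of 0.5 * penalty * beta**2
        grad = np.sum((y - p) * x) - penalty * beta
        beta += lr * grad
    return beta

# Toy data with complete separation: y = 1 exactly when x > 0.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Unpenalized fit: the estimate depends on how long the optimizer runs,
# because the likelihood has no finite maximizer under separation.
beta_short = fit_logistic(x, y, steps=200)
beta_long = fit_logistic(x, y, steps=20000)

# Penalized fit: the estimate stabilizes at a finite value.
beta_pen = fit_logistic(x, y, penalty=1.0, steps=20000)

print("unpenalized, 200 steps:  ", beta_short)
print("unpenalized, 20000 steps:", beta_long)
print("ridge-penalized:         ", beta_pen)
```

The ridge penalty is used here only because it is the simplest way to show a finite estimate; other penalties or priors would need their own gradient terms.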

References

  1. Albert, A. and Anderson, J. A. On the existence of maximum likelihood estimates in logistic regression models. Biometrika (1984), 71, 1-10.
  2. Betancourt, M. J., Byrne, S., Livingstone, S. and Girolami, M. The Geometric Foundations of Hamiltonian Monte Carlo. ArXiv e-prints 1410.5110, 2014.
  3. Clogg, C. C., Rubin, D. B., Schenker, N., Schultz, B. and Weidman, L. Multiple imputation of industry and occupation codes in census public-use samples using Bayesian logistic regression. Journal of the American Statistical Association (1991), 86, 68-78.
  4. Denwood, M. J. runjags: An R package providing interface utilities, model templates, parallel computing methods and additional distributions for MCMC models in JAGS. Journal of Statistical Software (2016), 71, 1-25.
  5. Duane, S., Kennedy, A. D., Pendleton, B. J. and Roweth, D. Hybrid Monte Carlo. Physics Letters B (1987), 195, 216-222.
  6. Efron, B. and Tibshirani, R. J. An introduction to the bootstrap. New York: Chapman & Hall, 1993.
  7. Firth, D. Bias reduction of maximum likelihood estimates. Biometrika (1993), 80, 27-38.
  8. Fisher, R. A. On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society (1922), 222, 309-368.
  9. Gelman, A., Jakulin, A., Pittau, M. G. and Su, Y. A weakly informative prior distribution for logistic and other regression models. Annals of Applied Statistics (2008), 2, 1360-1383.

Details

Primary Language: English
Subjects: Applied Mathematics
Journal Section: Research Article
Publication Date: December 31, 2020
Submission Date: September 2, 2019
Acceptance Date: May 29, 2020
Published in Issue: Year 2020, Volume 69, Number 2

APA
Altinisik, Y. (2020). A comparative study on the performance of frequentist and Bayesian estimation methods under separation in logistic regression. Communications Faculty of Sciences University of Ankara Series A1 Mathematics and Statistics, 69(2), 1083-1103. https://doi.org/10.31801/cfsuasmas.614492

Communications Faculty of Sciences University of Ankara Series A1 Mathematics and Statistics

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 International License.