Year 2021, Issue: 36, 49–63, 30.09.2021
Kumru Urgancı Tekın, Burcu Mestav, Neslihan İyit
Robust Logistic Modelling for Datasets with Unusual Points
Abstract
Unusual Points (UPs) arise for various reasons, such as observational error or a phenomenon whose cause is unknown. Influential Points (IPs), one type of UP, distort parameter estimates in the Logistic Regression model. Many researchers in fisheries sciences encounter this problem and resort to data manipulations to overcome it. The limitations of these manipulations have prompted researchers to adopt more suitable and innovative estimation techniques. In this study, we examine the classification accuracy and parameter estimation performance of the Maximum Likelihood (ML) estimator and several robust estimators using modified real datasets and simulation experiments. We also discuss the potential applicability of the assessed robust estimators to model estimation when the IPs are kept in the dataset. The results show that, among the robust estimators, the Weighted Maximum Likelihood (WML) and Weighted Bianco-Yohai (WBY) estimators outperform the others.
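To make the contrast between the ML estimator and a weighted ML estimator concrete, the following is a minimal sketch in Python and not the authors' implementation. It simulates a single-covariate logistic model, appends a few influential points, and compares plain ML with a weighted fit that gives zero weight to observations lying far from the bulk of the covariate values. The function names `fit_logistic` and `leverage_weights`, the median/MAD distance, the cutoff of 3, and the simulated data are all assumptions made for illustration; WML estimators in the literature typically base their weights on robust multivariate distances instead.

```python
import numpy as np

def fit_logistic(X, y, weights=None, n_iter=50, tol=1e-10):
    """(Weighted) ML logistic regression fitted by Newton-Raphson."""
    n, p = X.shape
    w = np.ones(n) if weights is None else np.asarray(weights, dtype=float)
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = np.clip(X @ beta, -30, 30)           # guard against overflow
        mu = 1.0 / (1.0 + np.exp(-eta))            # fitted probabilities
        grad = X.T @ (w * (y - mu))                # weighted score vector
        hess = (X * (w * mu * (1.0 - mu))[:, None]).T @ X
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def leverage_weights(x, cutoff=3.0):
    """Hypothetical hard-rejection weights: 0 for covariate values more
    than `cutoff` robust standard deviations (MAD) from the median."""
    med = np.median(x)
    mad = 1.4826 * np.median(np.abs(x - med))
    return (np.abs(x - med) / mad <= cutoff).astype(float)

# Simulated clean data from a logistic model (assumed true slope = 2).
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
true_beta = np.array([0.0, 2.0])
prob = 1.0 / (1.0 + np.exp(-(true_beta[0] + true_beta[1] * x)))
y = rng.binomial(1, prob).astype(float)

# Append 10 influential points: extreme x paired with the "wrong" response.
x_c = np.concatenate([x, np.full(10, 8.0)])
y_c = np.concatenate([y, np.zeros(10)])
X_c = np.column_stack([np.ones_like(x_c), x_c])

beta_ml = fit_logistic(X_c, y_c)                                  # plain ML
beta_wml = fit_logistic(X_c, y_c, weights=leverage_weights(x_c))  # weighted ML

print("ML estimate (intercept, slope): ", beta_ml)
print("WML estimate (intercept, slope):", beta_wml)
```

In this setup the appended points pull the ML slope away from the value used to generate the clean data, while the weighted fit ignores them. Hard rejection is only one possible weighting scheme; smoother weights, or a bounded loss in the spirit of the Bianco-Yohai estimator, would be alternative design choices.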