NEW ROBUST CUT-OFF VALUES FOR DETERMINING BAD LEVERAGE POINTS IN THE LOGISTIC REGRESSION MODEL
Year 2021,
Volume: 8 Issue: 2, 630 - 650, 27.07.2021
Ebru Gündoğan Aşık, Arzu Altin Yavuz, Zafer Küçük
Abstract
A high leverage point is an observation that lies far from the center of the x space. A high leverage point may be either a good or a bad leverage point. Bad leverage points are misclassified observations or outliers that are inconsistent with the other observations in the x space. The group deletion method, which is used to overcome the masking and swamping problems when identifying bad leverage points, is also applied in the logistic regression model. In this study, several robust cut-off values are proposed for the Deviance Components (DEVC) method, which is available in the literature for identifying bad leverage points. A simulation study shows that the robust cut-off values proposed for the deviance components method give better results than the cut-off value currently available in the literature.
References
- Alguraibawi, M., Midi, H., & Imon, A. H. M. R. (2015). A new robust diagnostic plot for classifying good and bad high leverage points in a multiple linear regression model. Mathematical Problems in Engineering, 2015. https://doi.org/10.1155/2015/279472
- Belsley, D. A., Kuh, E., & Welsch, R. E. (1980). Regression Diagnostics. https://doi.org/10.1002/0471725153
- Fitrianto, A., & Midi, H. (2010). Diagnostic-Robust Generalized Potentials for Identifying High Leverage Points in Mediation Analysis. World Applied Sciences Journal, 11(8), 979–987.
- Habshah, M., Norazan, M. R., & Imon, A. H. M. R. (2009). The performance of diagnostic-robust generalized potentials for the identification of multiple high leverage points in linear regression. Journal of Applied Statistics, 36(5), 507–520. https://doi.org/10.1080/02664760802553463
- Hadi, A. S. (1992). A New Measure of Overall Potential Influence in Linear Regression. Computational Statistics and Data Analysis, 14(1), 1–27. https://doi.org/10.1016/0167-9473(92)90078-T
- Hampel, F. R. (1974). The Influence Curve and its Role in Robust Estimation. Journal of the American Statistical Association, 69(346), 383–393.
- Hastings, C., Mosteller, F., Tukey, J. W., & Winsor, C. P. (1947). Low Moments for Small Samples: A Comparative Study of Order Statistics. The Annals of Mathematical Statistics, 18(3), 413–426. https://doi.org/10.1214/aoms/1177730388
- Hoaglin, D. C., & Mosteller, F. (1983). Understanding robust and exploratory data analysis. New York, NY: John Wiley & Sons.
- Hodges, J. L., Jr., & Lehmann, E. L. (1963). Estimates of Location Based on Rank Tests. Annals of Mathematical Statistics, 34(2), 598–611.
- Huber, P. J. (1964). Robust Estimation of a Location Parameter. The Annals of Mathematical Statistics, 35(1), 73–101. https://doi.org/10.1214/aoms/1177705148
- Imon, A. H. M. R. (2006). Identification of High Leverage Points in Logistic Regression. Pakistan Journal of Statistics, 22(2), 147–156.
- Imon, A. H. M. R., & Hadi, A. S. (2013). Identification of Multiple High Leverage Points in Logistic Regression. Journal of Applied Statistics, 40(12), 2601–2616. https://doi.org/10.1080/02664763.2013.822057
- Maronna, R. A., & Yohai, V. J. (1991). The Breakdown Point of Simultaneous General M Estimates of Regression and Scale. Journal of the American Statistical Association, 86(415), 699–703. https://doi.org/10.1080/01621459.1991.10475097
- Norazan, M. R., Sanizah, A., & Habshah, M. (2012). Identifying Bad Leverage Points in Logistic Regression Model Based on Robust Deviance Components. Mathematical Models and Methods in Modern Science, 62–67. Retrieved from http://www.wseas.us/e-library/conferences/2012/Porto/MAMECTIS/MAMECTIS-09.pdf
- Nurunnabi, A. A. M., Imon, A. H. M. R., & Nasser, M. (2010). Identification of Multiple Influential Observations in Logistic Regression. Journal of Applied Statistics, 37(10), 1605–1624. https://doi.org/10.1080/02664760903104307
- Rousseeuw, P. J., & Croux, C. (1993). Alternatives to the Median Absolute Deviation. Journal of the American Statistical Association, 88(424), 1273–1283.
- Sarkar, K., Midi, H., & Rana, S. (2011). Detection of outliers and influential observations in binary Logistic regression: An empirical study. Journal of Applied Sciences, 11(1), 26–35. https://doi.org/10.3923/jas.2011.26.35
- Syaiba, B. A., & Habshah, M. (2010). Robust Logistic Diagnostic for the Identification of High Leverage Points in Logistic Regression Model. Journal of Applied Sciences, 10(23), 3042–3050.
- Yavuz, A. A. (2013). Estimation of the shape parameter of the Weibull distribution using linear regression methods: Non-censored samples. Quality and Reliability Engineering International, 29(8), 1207–1219. https://doi.org/10.1002/qre.1472
NEW ROBUST CUT-OFF VALUES IN DETERMINING BAD LEVERAGE POINTS IN THE LOGISTIC REGRESSION MODEL
Abstract
A high leverage point is an observation that lies far from the center of the x space. A high leverage point may be either a good or a bad leverage point. Bad leverage points are misclassified observations or outliers that are incompatible with the other observations in the x space. The group deletion method, used to eliminate the masking and swamping problems in determining bad leverage points, is also used in the logistic regression model. In this study, some robust cut-off values are proposed for the deviance components (DEVC) method available in the literature for determining bad leverage points. A simulation study shows that the proposed robust cut-off values give better results for the deviance components method than the cut-off value available in the literature.