The main purpose of this study is to examine the Type I error and statistical power rates of Differential Item Functioning (DIF) detection techniques based on different measurement theories under various conditions. To this end, a simulation study was conducted using the Mantel-Haenszel (MH), Logistic Regression (LR), Lord’s χ2, and Raju’s Area Measures techniques, of which the first two are grounded in Classical Test Theory (CTT) and the latter two in Item Response Theory (IRT). In the simulation design, the two-parameter logistic item response model, the groups’ ability distributions, and the DIF type were held fixed, while sample size (1800, 3000), sample size ratio (0.50, 1), test length (20, 80), and the rate of DIF-containing items (0, 0.05, 0.10) were manipulated, yielding 24 conditions in total (2 × 2 × 2 × 3). All statistical analyses were performed in R. The study found that the Type I error rates in all conditions exceeded the nominal error level; MH had the highest error rate, whereas Raju’s Area Measures had the lowest. MH also produced the highest statistical power rates. Taken together, the Type I error and statistical power findings showed that the techniques based on both theories performed better with a sample size of 1800, and that increasing the sample size affected the CTT-based techniques more than the IRT-based ones. Finally, the techniques’ Type I error rates were lower and their statistical power rates were higher when the test length was 80 and the group sample sizes were unequal.
Keywords: Classical test theory, Item response theory, Differential item functioning, Type I error, Statistical power
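As an illustration of how one replication of such a simulation might be set up, the sketch below generates two-parameter logistic (2PL) responses for a reference and a focal group and screens the items with the four DIF techniques named in the abstract via the difR package. The item parameters, the size of the seeded DIF effect, and the use of difR are assumptions made for illustration; they are not taken from the study itself.

```r
# Hypothetical sketch: one replication of a DIF simulation in R using the difR
# package; parameter values below are illustrative assumptions, not the study's.
library(difR)

set.seed(2023)
n_ref <- 900; n_foc <- 900           # total N = 1800 with equal group sizes (ratio = 1)
n_items <- 20                        # shorter test-length condition
a <- runif(n_items, 0.8, 2.0)        # 2PL discrimination parameters (assumed range)
b <- rnorm(n_items, 0, 1)            # 2PL difficulty parameters
b_foc <- b
dif_items <- 1:2                     # 10% of the items seeded with DIF
b_foc[dif_items] <- b_foc[dif_items] + 0.6   # uniform DIF: harder for the focal group

# Generate dichotomous 2PL responses for n examinees with N(0, 1) abilities
gen_resp <- function(n, a, b) {
  theta <- rnorm(n)
  p <- plogis(sweep(outer(theta, b, "-"), 2, a, "*"))  # P(X = 1 | theta)
  matrix(rbinom(length(p), 1, p), nrow = n)
}

resp  <- rbind(gen_resp(n_ref, a, b), gen_resp(n_foc, a, b_foc))
group <- c(rep(0, n_ref), rep(1, n_foc))   # 0 = reference, 1 = focal

mh   <- difMH(resp, group, focal.name = 1)                    # Mantel-Haenszel
lr   <- difLogistic(resp, group, focal.name = 1)              # Logistic Regression
lord <- difLord(resp, group, focal.name = 1, model = "2PL")   # Lord's chi-square
raju <- difRaju(resp, group, focal.name = 1, model = "2PL")   # Raju's area measure

# Items flagged by each technique; across many replications, flags on DIF-free
# items estimate Type I error and flags on seeded items estimate power.
lapply(list(MH = mh, LR = lr, Lord = lord, Raju = raju), function(x) x$DIFitems)
```

In a full study, a replication like this would be wrapped in a loop over the manipulated conditions (sample size, sample size ratio, test length, DIF rate), with flagging rates averaged over replications to obtain the Type I error and power estimates.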
| Primary Language | English |
| --- | --- |
| Subjects | Measurement Theories and Applications in Education and Psychology; Simulation Study |
| Journal Section | Articles |
| Authors | |
| Publication Date | December 23, 2023 |
| Submission Date | September 29, 2023 |
| Published in Issue | Year 2023 |