Research Article
Year 2020, Volume: 11 Issue: 1, 1 - 12, 24.03.2020
https://doi.org/10.21031/epod.531509


Performances of MIMIC and Logistic Regression Procedures in Detecting DIF


Abstract

In this study, the differential item functioning (DIF) detection performances of the multiple indicators, multiple causes (MIMIC) and logistic regression (LR) methods were investigated for dichotomous data. The two methods were compared by calculating Type I error rates and power under each simulation condition. The manipulated conditions were sample size (2,000 and 4,000 respondents), the ability distribution of the focal group [N(0, 1) and N(-0.5, 1)], and the percentage of DIF items (10% and 20%). The ability distribution of the reference group [N(0, 1)], the ratio of the focal group to the reference group (1:1), the test length (30 items), and the between-group difference in the difficulty parameters of the DIF items (0.6) were held constant. With respect to Type I error rates, the MIMIC method was more affected by changes in sample size, whereas LR was more affected by changes in the percentage of DIF items. With respect to power, sample size was the most influential factor for both methods.
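The LR procedure compared above (Swaminathan & Rogers, 1990) flags an item as showing DIF when adding group membership and a group-by-score interaction to a logistic model of the item response, matched on total score, significantly improves model fit. The following Python sketch is a minimal, hypothetical illustration of that test under one of the study's conditions (2,000 respondents, 1:1 groups, focal ability N(-0.5, 1), 30 items, a 0.6 difficulty shift on one item); the Rasch generating model, the Newton-Raphson fitting helper, and all names here are assumptions for illustration, not the authors' code — the study's own analyses were run in Mplus, SAS, and R.

```python
import math
import numpy as np

rng = np.random.default_rng(42)
n_per_group, n_items, dif_shift = 1000, 30, 0.6  # one simulated condition

# Abilities: reference group ~ N(0, 1), focal group ~ N(-0.5, 1), 1:1 ratio.
theta = np.concatenate([rng.normal(0.0, 1.0, n_per_group),
                        rng.normal(-0.5, 1.0, n_per_group)])
group = np.repeat([0.0, 1.0], n_per_group)

# Rasch generating model; item 0 is 0.6 logits harder for the focal group.
b = rng.uniform(-1.5, 1.5, n_items)
b_mat = np.tile(b, (2 * n_per_group, 1))
b_mat[group == 1.0, 0] += dif_shift
prob = 1.0 / (1.0 + np.exp(-(theta[:, None] - b_mat)))
resp = (rng.random(prob.shape) < prob).astype(float)

def logit_llf(X, y, iters=25):
    """Fit a logistic regression by Newton-Raphson; return its log-likelihood."""
    X = np.column_stack([np.ones(len(y)), X])
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        hess = (X * (p * (1.0 - p))[:, None]).T @ X
        w += np.linalg.solve(hess, X.T @ (y - p))
    p = 1.0 / (1.0 + np.exp(-X @ w))
    return np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

def lr_dif_test(item):
    """LR DIF test: base model (score) vs. full model (score, group, interaction)."""
    y = resp[:, item]
    score = resp.sum(axis=1)  # total score as the matching variable
    ll_base = logit_llf(score, y)
    ll_full = logit_llf(np.column_stack([score, group, score * group]), y)
    g2 = 2.0 * (ll_full - ll_base)   # likelihood-ratio chi-square, 2 df
    return g2, math.exp(-g2 / 2.0)   # survival function of chi-square, df = 2

g2, pval = lr_dif_test(0)
print(f"item 0: G2 = {g2:.2f}, p = {pval:.4g}")
```

For df = 2 the chi-square survival function reduces to exp(-G²/2), which keeps the sketch free of a SciPy dependency; testing uniform and nonuniform DIF jointly with 2 df follows the standard LR formulation.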

References

  • Cohen, J., & Cohen, P. (1983). Applied multiple regression/correlation analysis for the behavioral sciences (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
  • Crane, P. K., Belle, G., & Larson, E. B. (2004). Test bias in a cognitive test: Differential item functioning in the CASI. Statistics in Medicine, 23(2), 241–256. doi: 10.1002/sim.1713
  • Dorans, N. J., & Kulick, E. (1986). Demonstrating the utility of the standardization approach to assessing unexpected differential item performance on the Scholastic Aptitude Test. Journal of Educational Measurement, 23(4), 355-368.
  • Finch, H. (2005). The MIMIC model as a method for detecting DIF: Comparison with Mantel-Haenszel, SIBTEST, and the IRT likelihood ratio. Applied Psychological Measurement, 29(4), 278–295. doi: 10.1177/0146621605275728
  • Finch, W. H., & French, B. F. (2007). Detection of crossing differential item functioning: A comparison of four methods. Educational and Psychological Measurement, 67(4), 565–582. doi: 10.1177/0013164406296975
  • Fleishman, J. A., Spector, W. D., & Altman, B. M. (2002). Impact of differential item functioning on age and gender differences in functional disability. Journal of Gerontology: Social Sciences, 57B(5), 275–284.
  • Holland, P. W., & Thayer, D. T. (1988). Differential item performance and the Mantel-Haenszel procedure. In H. Wainer, & H. I. Braun (Eds.), Test validity (pp. 129-145). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Holland, P. W., & Wainer, H. (1993). Differential item functioning. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
  • Lord, F. M. (1980). Applications of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Mazor, K. M., Kanjee, A., & Clauser, B. E. (1995). Using logistic regression and the Mantel-Haenszel with multiple ability estimates to detect differential item functioning. Journal of Educational Measurement, 32(2), 131–144.
  • Muthén, B. O. (1988). Some uses of structural equation modeling in validity studies: Extending IRT to external variables. In H. Wainer, & H. I. Braun (Eds.), Test validity (pp. 213-238). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Muthén, L. K., & Muthén, B. O. (1998-2010). Mplus user’s guide (6th ed.). Los Angeles, CA: Muthén & Muthén.
  • Oort, F. J. (1998). Simulation study of item bias detection with restricted factor analysis. Structural Equation Modeling: A Multidisciplinary Journal, 5(2), 107–124. doi: 10.1080/10705519809540095
  • R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/.
  • Sari, H. I., & Huggins, A. C. (2014). Differential item functioning detection across two methods of defining group comparisons: Pairwise and composite group comparisons. Educational and Psychological Measurement, 75(4), 648-676. doi: 10.1177/0013164414549764
  • SAS Institute Inc. (2007). SAS® 9.1.3 qualification tools user’s guide. Cary, NC: SAS Institute Inc.
  • Shealy, R., & Stout, W. (1993). A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF. Psychometrika, 58(2), 159-194.
  • Swaminathan, H., & Rogers, H. J. (1990). Detecting differential item functioning using logistic regression procedures. Journal of Educational Measurement, 27(4), 361-370.
  • Thissen, D., Steinberg, L., & Wainer, H. (1988). Use of item response theory in the study of group differences in trace lines. In H. Wainer, & H. I. Braun (Eds.), Test validity (pp. 147-169). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Vaughn, B. K., & Wang, Q. (2010). DIF trees: Using classification trees to detect differential item functioning. Educational and Psychological Measurement, 70(6), 941–952. doi: 10.1177/0013164410379326
  • Wang, W.-C., & Shih, C.-L. (2010). MIMIC methods for assessing differential item functioning in polytomous items. Applied Psychological Measurement, 34(3), 166–180. doi: 10.1177/0146621609355279
  • Wang, W.-C., Shih, C.-L., & Yang, C.-C. (2009). The MIMIC method with scale purification for detecting differential item functioning. Educational and Psychological Measurement, 69(5), 713–731. doi: 10.1177/0013164409332228
  • Woods, C. M. (2009). Evaluation of MIMIC-model methods for DIF testing with comparison to two-group analysis. Multivariate Behavioral Research, 44(1), 1–27. doi: 10.1080/00273170802620121
  • Woods, C. M., Oltmanns, T. F., & Turkheimer, E. (2009). Illustration of MIMIC-model DIF testing with the Schedule for Nonadaptive and Adaptive Personality. Journal of Psychopathology and Behavioral Assessment, 31, 320–330. doi: 10.1007/s10862-008-9118-9
  • Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense.
There are 25 citations in total.

Details

Primary Language English
Journal Section Articles
Authors

Seçil Uğurlu 0000-0002-3495-7797

Burcu Atar 0000-0003-3527-686X

Publication Date March 24, 2020
Acceptance Date November 22, 2019
Published in Issue Year 2020 Volume: 11 Issue: 1

Cite

APA Uğurlu, S., & Atar, B. (2020). Performances of MIMIC and Logistic Regression Procedures in Detecting DIF. Journal of Measurement and Evaluation in Education and Psychology, 11(1), 1-12. https://doi.org/10.21031/epod.531509