Research Article

Comparing Differential Item Functioning Based On Multilevel Mixture Item Response Theory, Mixture Item Response Theory And Manifest Groups

Year 2024, Volume: 15 Issue: 2, 120 - 137, 30.06.2024
https://doi.org/10.21031/epod.1457880

Abstract

Differential item functioning (DIF) is usually studied in the context of manifest groups. With the recent increase in analyses based on mixture models, investigating the sources of differences between groups has come to the forefront. It is also considered important to examine DIF with mixture models that account for levels in the data. This study compares DIF results based on the multilevel mixture item response theory (MMIRT) model, the mixture item response theory (MIRT) model, and manifest groups. The sample consists of students who answered the second booklet of the electronic Trends in International Mathematics and Science Study (eTIMSS) 2019 and whose gender was coded. Responses to 15 items were analyzed with the Mantel-Haenszel (MH) method, using gender as the manifest grouping variable. For the MIRT and MMIRT models, the most appropriate models were selected by varying the number of groups and the number of levels, and DIF across the resulting latent groups was also analyzed with the MH method. More items displayed DIF under both the MIRT model and the MMIRT model than under manifest groups: only one item displayed DIF by gender, whereas 14 items displayed DIF under the MIRT model and seven under the MMIRT model. The DIF items and DIF effect sizes found with the MIRT and MMIRT models do not overlap completely. For this reason, the data should be checked for a multilevel structure before the analyses, and if such a structure exists, the analyses should take it into account.
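
As an illustration of the MH method named in the abstract, the following is a minimal sketch of the Mantel-Haenszel common odds ratio for one dichotomous item, with examinees matched on total score. It is not the authors' code or software. The function name `mantel_haenszel_dif`, its arguments, and the toy data are hypothetical, and the conversion to the ETS delta scale (multiplying the log odds ratio by -2.35) is the conventional one rather than something stated in the abstract. The same computation applies whether the groups are manifest (gender) or latent classes assigned by a mixture model.

```python
import math
from collections import defaultdict


def mantel_haenszel_dif(item, total, group, reference, focal):
    """Return (alpha_MH, delta_MH) for one dichotomous item.

    item  : 0/1 responses to the studied item
    total : matching variable (e.g. total test score) per examinee
    group : group label per examinee (manifest group or latent class)
    """
    # Per score stratum: [ref correct, ref incorrect, focal correct, focal incorrect]
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for x, s, g in zip(item, total, group):
        if g == reference:
            counts[s][0 if x == 1 else 1] += 1
        elif g == focal:
            counts[s][2 if x == 1 else 3] += 1

    num = 0.0  # sum over strata of A_k * D_k / N_k
    den = 0.0  # sum over strata of B_k * C_k / N_k
    for a, b, c, d in counts.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n

    if num == 0 or den == 0:
        raise ValueError("odds ratio is undefined for these data")

    alpha = num / den                # common odds ratio; 1 means no DIF
    delta = -2.35 * math.log(alpha)  # ETS delta scale; 0 means no DIF
    return alpha, delta


# Toy data (hypothetical): one score stratum, 4 reference and 4 focal examinees.
item = [1, 1, 1, 0, 1, 0, 0, 0]
total = [5] * 8
group = ["M"] * 4 + ["F"] * 4
alpha, delta = mantel_haenszel_dif(item, total, group, reference="M", focal="F")
print(alpha, delta)
```

An odds ratio above 1 (negative delta) indicates that the item favors the reference group among examinees with the same total score. In practice the statistic is accompanied by a significance test and an effect-size classification, which this sketch omits.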

References

  • Ackerman, T. A. (1992). A didactic explanation of item bias, item impact, and item validity from a multidimensional perspective. Journal of Educational Measurement, 29, 67–91.
  • American Educational Research Association, American Psychological Association, & National Council on Measurement in Education (Eds.). (2014). Standards for educational and psychological testing. American Educational Research Association.
  • Aydemir, F. (2023). PISA 2018 matematik ve fen bilimleri alt testlerinde değişen madde fonksiyonunun Rasch Ağacı, Mantel–Haenszel ve Lojistik Regresyon yöntemleriyle incelenmesi. Unpublished master thesis, Gazi University, Ankara.
  • Bauer, D. J., & Curran, P. J. (2003). Distributional assumptions of growth mixture models: implications for overextraction of latent trajectory classes. Psychological Methods, 8(3), 338–363. https://doi.org/10.1037/1082-989X.8.3.338
  • Bayram, Ö. (2024). Bir tutum ölçeği üzerinden Mantel–Haenszel ve sıralı lojistik regresyon yöntemlerine göre değişen madde fonksiyonu incelenmesi. Unpublished master thesis, Kocaeli University, Kocaeli.
  • Büyüköztürk, Ş. (2018). Veri analizi el kitabı: istatistik, araştırma deseni, SPSS uygulamaları ve yorum. Ankara: Pegem Akademi.
  • Cho, S. J. (2007). A multilevel mixture irt model for dif analysis. Unpublished Doctoral Dissertation, University of Georgia.
  • Cho, S.-J., & Cohen, A. S. (2010). A multilevel mixture irt model with an application to dif. Journal of educational and behavioral statistics, 35(3), 336–370. https://doi.org/10.3102/1076998609353111
  • Cho, S.-J., Cohen, A. S., & Kim, S.-H. (2013). Markov chain Monte Carlo estimation of a mixture item response theory model. Journal of Statistical Computation and Simulation, 83, 278–306.
  • Cho, S. J., Suh, Y., & Lee, W. Y. (2015). An NCME instructional module on latent dif analysis using mixture item response models. Educational Measurement: Issues and Practice.
  • Choi, Y. J., Alexeev, N. & Cohen, A. S. (2015) Differential item functioning analysis using a mixture 3-parameter logistic model with a covariate on the timss 2007 mathematics test, International Journal of Testing, 15(3), 239-253, DOI: 10.1080/15305058.2015.1007241
  • Cohen, A.S., & Bolt, D.M. (2005). A mixture model analysis of Differential item functioning. Journal of Educational Measurement, 42(2), 133-148.
  • De Ayala, R. J., Kim, S.-H., Stapleton, L. M., & Dayton, C. M. (2002). Differential item functioning: A mixture distribution conceptualization. International Journal of Testing, 2, 243–276.
  • De Boeck, P., Cho, S.-J., & Wilson, M. (2011). Explanatory secondary dimension modeling of latent differential item functioning. Applied Psychological Measurement, 35, 583–603.
  • Dras, L. (2023). Multilevel mixture irt modeling for the analysis of differential item functioning. Unpublished Doctoral dissertation, Brigham Young University.
  • Embretson, S. E., & Reise, S. P. (2000). Item response theory for psychologists. Mahwah, NJ: Erlbaum.
  • Finch, W. H., & French, B. F. (2012). Parameter estimation with mixture item response theory models: A monte carlo comparison of maximum likelihood and bayesian methods. Journal of Modern Applied Statistical Methods, 11(1), 167-178.
  • Finch, W. H., & Hernández Finch, M. E. (2013). Investigation of Specific Learning Disability and Testing Accommodations Based Differential Item Functioning Using a Multilevel Multidimensional Mixture Item Response Theory Model. Educational and Psychological Measurement, 73(6), 973–993. https://doi.org/10.1177/0013164413494776
  • French, B. F., & Finch, W. H. (2010). Hierarchical logistic regression: Accounting for multilevel data in DIF detection. Journal of Educational Measurement, 47, 299-317.
  • Gurkan, G. (2021). From OLS to multilevel multidimensional mixture IRT: A model refinement approach to investigating patterns of relationships in PISA 2012 data. Unpublished Doctoral Dissertation, Boston, United States of America.
  • Holland, P.W. & Thayer, D.T. (1986). Differential item performance and the Mantel‐Haenszel procedure (Technical Report No. 86–69). Princeton, NJ: Educational Testing Service.
  • Holland, P.W. & Thayer, D.T. (1988) Differential item performance and the Mantel-Haenszel procedure, in Wainer, H. and Braun, H.I. (Eds.): Test Validity, 129–145, Erlbaum, Hillsdale, NJ.
  • Jiao, H., & Chen, Y.-F. (2014). Differential item and testlet functioning. In A. Kunnan (Ed.), The Companion to Language Assessments (pp.1282-1300). John Wiley & Sons, Inc.
  • Jiao, H., Kamata, A., Wang, S. & Jin, Y. (2012). A multilevel testlet model for dual local dependence. Journal of Educational Measurement, 49, 82-100.
  • Kaiser, H.F. (1960). The application of electronic computers to factor analysis. Educational and Psychological Measurement, 20, 141-151.
  • Kelderman, H., & Macready, G.B. (1990). The use of loglinear models for assessing differential item functioning across manifest and latent examinee groups. Journal of Educational Measurement, 27, 307-327.
  • Kline, R. B. (2016). Principles and practice of structural equation modeling (4th ed.). Guilford Press.
  • Kristjansson, E., Aylesworth, R., McDowell, I., & Zumbo, B. D. (2005). A Comparison of four methods for detecting differential item functioning in ordered response items. Educational and Psychological Measurement, 65(6), 935-953.
  • Lee, W. Y., Cho, S. J., & Sterba, S. K. (2018). Ignoring a multilevel structure in mixture item response models: impact on parameter recovery and model selection. Applied psychological measurement, 42(2), 136–154. https://doi.org/10.1177/0146621617711999.
  • Li, F., Cohen, A. S., Kim, S., & Cho, S. (2009). Model selection methods for mixture dichotomous IRT models. Applied Psychological Measurement, 33(5), 353–373. doi: 10.1177/0146621608326422.
  • Lord, F.M. (1980) Applications of item response theory to practical testing problems, Erlbaum, Hillsdale, NJ.
  • Magis, D., Béland, S., Tuerlinckx, F., & De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42(3), 847-862. doi:10.3758/BRM.42.3.847.
  • Mantel, N. & Haenszel, W. (1959) Statistical aspects of the analysis of data from retrospective studies of disease, Journal of the National Cancer Institute, 22(4), 719–748.
  • Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American psychologist, 50(9), 741.
  • Mislevy, R. J., & Verhelst, N. (1990). Modeling item responses when different subjects employ different solution strategies. Psychometrika, 55, 195-215.
  • Muthén, L.K. & Muthén, B.O. (1998-2017). Mplus User’s Guide. Eighth Edition. Los Angeles, CA: Muthén & Muthén.
  • Paek, I., & Cho, S.-J. (2015). A note on parameter estimate comparability: Across latent classes in mixture IRT modeling. Applied Psychological Measurement, 39(2), 135–143. https://doi.org/10.1177/0146621614549651
  • Raju, N.S. (1988). The area between two item characteristic curves, Psychometrika, 53(4), 495–502.
  • Raju, N.S. (1990) Determining the significance of estimated signed and unsigned areas between two item response functions, Applied Psychological Measurement, 14(2), 197–207.
  • Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical Linear Models. Applications and Data Analysis Methods (2nd ed.). Thousand Oaks, CA: Sage Publications.
  • Revelle, W. (2023). Psych: Procedures for psychological, psychometric, and personality research. (Version 2.3.3). https://cran.r-project.org/web/packages/psych/psych.pdf
  • Rost, J. (1990). Rasch models in latent classes: An integration of two approaches to item analysis. Applied Psychological Measurement, 14, 271-282.
  • Roussos, L. & Stout, W. (1996). A multidimensionality-based DIF analysis paradigm. Applied Psychological Measurement, 20(4), 355–371.
  • Samuelsen, K. M. (2005). Examining differential item functioning from a latent class perspective. Unpublished doctoral dissertation, University of Maryland, College Park.
  • Sırgancı, G. (2019). Karma rasch model ile değişen madde fonksiyonunun belirlenmesinde kovaryant (ortak) değişkenin etkisi. Unpublished doctoral dissertation, Ankara University, Ankara.
  • Sen, S. (2022). Mplus ile yapısal eşitlik modellemesi uygulamaları. Ankara: Nobel.
  • Sen, S., Cohen, A., & Kim, S.-H. (2020). A short note on obtaining item parameter estimates of IRT models with Bayesian estimation in Mplus. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 11(3), 266-282. doi: 10.21031/epod.693719
  • Sen, S., & Toker, T. (2021). An application of multilevel mixture item response theory model. Journal of Measurement and Evaluation in Education and Psychology, 12(3), 226-238. doi: 10.21031/epod.893149
  • Shealy, R. and Stout, W. (1993) A model-based standardization approach that separates true bias/DIF from group ability differences and detects test bias/DTF as well as item bias/DIF, Psychometrika, 58(2), 159–194.
  • Thissen, D., Steinberg, L. & Wainer, H. (1988) ‘Use of item response theory in the study of group differences in trace lines’, in Wainer, H. and Braun, H.I. (Eds.): Test Validity, 147–169, Erlbaum, Hillsdale, NJ.
  • Toker, T. & Green, K. (2021). A comparison of latent class analysis and the mixture rasch model using 8th grade mathematics data in the fourth international mathematics and science study (timss-2011), International Journal of Assessment Tools in Education 8(4), 959–974
  • Unal, F. (2023). Farklı oranlardaki kayıp verilere farklı atama yöntemleriyle veri atamanın madde tepki kuramına dayalı yöntemlerle değişen madde fonksiyonuna etkisinin incelenmesi. Unpublished master thesis, Akdeniz University, Antalya.
  • Uyar, Ş. (2015). Gözlenen gruplara ve örtük sınıflara göre belirlenen değişen madde fonksiyonunun karşılaştırılması. Unpublished doctoral dissertation, Hacettepe University, Ankara.
  • Yalcin, S. (2018). Determining differential item functioning with the mixture item response theory. Eurasian Journal of Educational Research 74, 187-206
  • Zhang, Y. (2017). Detection of latent differential item functioning (dif) using mixture 2pl irt model with covariate. Unpublished doctoral dissertation. University of Pittsburgh. Pittsburgh
There are 55 citations in total.

Details

Primary Language English
Subjects Statistical Analysis Methods, Item Response Theory
Journal Section Articles
Authors

Ömer Doğan 0000-0001-5169-520X

Burcu Atar 0000-0003-3527-686X

Publication Date June 30, 2024
Submission Date March 24, 2024
Acceptance Date June 6, 2024
Published in Issue Year 2024 Volume: 15 Issue: 2

Cite

APA Doğan, Ö., & Atar, B. (2024). Comparing Differential Item Functioning Based On Multilevel Mixture Item Response Theory, Mixture Item Response Theory And Manifest Groups. Journal of Measurement and Evaluation in Education and Psychology, 15(2), 120-137. https://doi.org/10.21031/epod.1457880