Year 2020, Volume: 11 Issue: 2, 131 - 146, 13.06.2020
Ömür Kaya Kalkan, İsmail Çuhadar
Supporting Institution
PAMUKKALE ÜNİVERSİTESİ
Project Number
ADEP-2018KRM002-063
An Evaluation of 4PL IRT and DINA Models for Estimating Pseudo-Guessing and Slipping Parameters
Abstract
In an achievement test, examinees who have the knowledge and skills an item requires are expected to answer it correctly, while examinees who lack them are expected to answer incorrectly. However, an examinee can answer a multiple-choice item correctly by guessing, or answer an easy item incorrectly because of anxiety or carelessness. Either case may bias the estimates of examinee abilities and item parameters. The 4PL IRT model and the DINA model can be used to mitigate these effects on parameter estimation. This simulation study compares the pseudo-guessing and slipping parameters estimated by the 4PL IRT model and the DINA model under several study conditions. The datasets were simulated with the DINA model. The results showed that the bias of the estimated slipping and guessing parameters was reasonably small in general for both models, although the estimates were more biased when the datasets were analyzed with the 4PL IRT model than with the DINA model (the average bias for both guessing and slipping parameters was .00 for the DINA model but .08 for the 4PL IRT model). Accordingly, both the 4PL IRT and DINA models can be considered for analyzing datasets contaminated by guessing and slipping effects.
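To make the two parameterizations concrete, the following is a minimal Python sketch of how guessing and slipping enter each model, and of how bias can be computed as the mean difference between estimated and true values. It is not the authors' simulation code. The function names, the Q-matrix, the parameter values, and the sample size are all hypothetical. The estimator `naive_guess_slip` is an illustrative shortcut that uses the true attribute profiles, which a real analysis would not know; dedicated software estimates these parameters from the responses alone.

```python
import math
import random


def p_4pl(theta, a, b, c, d):
    """4PL IRT: P(correct) = c + (d - c) / (1 + exp(-a * (theta - b))).

    c is the lower asymptote (pseudo-guessing); d is the upper asymptote,
    so 1 - d plays the role of a slipping probability.
    """
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))


def p_dina(alpha, q, guess, slip):
    """DINA: P(correct) = 1 - slip if the examinee masters every attribute
    the item requires (eta = 1), otherwise guess (eta = 0)."""
    eta = all(a_k >= q_k for a_k, q_k in zip(alpha, q))
    return 1.0 - slip if eta else guess


def simulate_dina(alphas, q_matrix, guess, slip, rng):
    """Draw a 0/1 response matrix (examinees x items) from the DINA model."""
    return [
        [1 if rng.random() < p_dina(alpha, q, g, s) else 0
         for q, g, s in zip(q_matrix, guess, slip)]
        for alpha in alphas
    ]


def naive_guess_slip(responses, alphas, q_matrix, item):
    """Illustrative shortcut (assumption): uses the TRUE attribute profiles.

    guess-hat = proportion correct among non-masters of the item,
    slip-hat  = proportion incorrect among masters of the item.
    Assumes both groups are non-empty.
    """
    q = q_matrix[item]
    masters, non_masters = [], []
    for alpha, row in zip(alphas, responses):
        eta = all(a_k >= q_k for a_k, q_k in zip(alpha, q))
        (masters if eta else non_masters).append(row[item])
    guess_hat = sum(non_masters) / len(non_masters)
    slip_hat = 1.0 - sum(masters) / len(masters)
    return guess_hat, slip_hat


def average_bias(estimates, true_values):
    """Mean signed difference between estimated and true parameter values."""
    return sum(e - t for e, t in zip(estimates, true_values)) / len(true_values)


if __name__ == "__main__":
    rng = random.Random(2020)
    q_matrix = [(1, 0), (0, 1), (1, 1)]   # hypothetical 3 items x 2 attributes
    guess = [0.2, 0.2, 0.2]               # hypothetical true guessing values
    slip = [0.1, 0.1, 0.1]                # hypothetical true slipping values
    alphas = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(4000)]
    data = simulate_dina(alphas, q_matrix, guess, slip, rng)
    est = [naive_guess_slip(data, alphas, q_matrix, j) for j in range(3)]
    print("guessing bias:", average_bias([g for g, _ in est], guess))
    print("slipping bias:", average_bias([s for _, s in est], slip))
```

In this sketch the 4PL lower asymptote `c` corresponds to the DINA guessing parameter and `1 - d` to the DINA slipping parameter, which is what makes the two sets of estimates comparable on the same scale.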