Multidimensional Computerized Adaptive Testing Simulations in R
Year 2022, 118 - 137, 10.03.2022
Fazilet Gül İnce Aracı, Şeref Tan
Abstract
Computerized Adaptive Testing (CAT) is a testing technique that reduces the number of items that must be administered by selecting items that match each individual's ability level. CAT applications were first built on unidimensional Item Response Theory (IRT); in recent years, Multidimensional CAT (MCAT) applications have gained momentum with the development of multidimensional IRT (MIRT) models. Researchers often rely on simulation studies to design the final adaptive testing application and to test the effectiveness of adaptive testing applications developed with different methods. R has recently become one of the most widely used programming languages in Monte Carlo simulation studies because it is free and open-source software. The aims of this study are to present the MCAT simulation process step by step in the R environment, to examine the effects of the conditions that researchers can manipulate during the simulation process under two different dimensionality models, and to examine the effect of treating multidimensional structures as unidimensional on simulation results. To this end, datasets were generated in accordance with within-item and between-item dimensionality models, MCAT simulation studies were constructed with different customizations, and MCAT simulation results were compared with unidimensional CAT simulation results. All commands required for each simulation example were explained, and the results were reported for each condition.
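The distinction between the two dimensionality models can be illustrated with a small data-generation sketch. The article's own commands are written in R and are not reproduced here; what follows is only a language-neutral illustration in Python, assuming a compensatory two-parameter MIRT model, uncorrelated normally distributed traits, and arbitrary parameter ranges. The function names, sample sizes, and seed are all hypothetical and do not come from the study.

```python
import math
import random


def loading_matrix(n_items, n_dims, structure, rng):
    """Build an n_items x n_dims matrix of discrimination parameters.

    'between': each item measures exactly one dimension (simple structure).
    'within':  each item measures every dimension at once.
    """
    rows = []
    for i in range(n_items):
        if structure == "between":
            row = [0.0] * n_dims
            row[i % n_dims] = rng.uniform(0.8, 2.0)  # assumed range
        elif structure == "within":
            row = [rng.uniform(0.8, 2.0) for _ in range(n_dims)]
        else:
            raise ValueError("structure must be 'between' or 'within'")
        rows.append(row)
    return rows


def prob_correct(theta, a, d):
    """Compensatory model: P(x = 1) = logistic(a . theta + d)."""
    z = sum(a_k * t_k for a_k, t_k in zip(a, theta)) + d
    return 1.0 / (1.0 + math.exp(-z))


def simulate_responses(thetas, A, d, rng):
    """Draw one 0/1 response per person-item pair."""
    return [
        [1 if rng.random() < prob_correct(theta, a, d_j) else 0
         for a, d_j in zip(A, d)]
        for theta in thetas
    ]


rng = random.Random(2022)                 # fixed seed for reproducibility
n_items, n_dims, n_persons = 30, 3, 200   # hypothetical sizes

A_between = loading_matrix(n_items, n_dims, "between", rng)
A_within = loading_matrix(n_items, n_dims, "within", rng)
d = [rng.gauss(0.0, 1.0) for _ in range(n_items)]  # item intercepts
# Assumption: traits are independent standard normal (no correlation).
thetas = [[rng.gauss(0.0, 1.0) for _ in range(n_dims)]
          for _ in range(n_persons)]

data_between = simulate_responses(thetas, A_between, d, rng)
data_within = simulate_responses(thetas, A_within, d, rng)
```

The only difference between the two datasets is the pattern of zeros in the discrimination matrix: one nonzero entry per row for between-item dimensionality, and a full row for within-item dimensionality. The sketch covers response generation only; adaptive item selection, ability estimation, and stopping rules are not shown.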