Research Article

The Effect of Strong and Weak Unidimensional Item Pools on Computerized Adaptive Classification Testing

Year 2022, Volume 4, Issue 2, 310-321, 31.12.2022
https://doi.org/10.51535/tell.1202804

Abstract

Computerized Adaptive Classification Tests (CACT) aim to classify individuals efficiently, reaching high classification accuracy with as few items as possible drawn from large item pools. The characteristic features of an item pool include the number of items, the item factor loadings, the distribution of the test information function, and dimensionality. In this study, we present the results of a comprehensive simulation that examined how item selection methods (maximum Fisher information, MFI, and Kullback-Leibler information, KLI), ability estimation methods (expected a posteriori, EAP, and weighted likelihood estimation, WLE), and classification methods (the sequential probability ratio test, SPRT, and the confidence interval method, CI) were affected by strongly and weakly unidimensional item pools. The findings indicate that CI consistently reached classification accuracy similar to SPRT but with roughly half the test length. In addition, the KLI and MFI item selection methods were not affected by whether the item pool was weakly or strongly unidimensional. Based on these findings, CI combined with EAP can be recommended for CACT studies regardless of whether the item pool is weakly or strongly unidimensional, whereas WLE is recommended only under strongly unidimensional item pools. For weakly unidimensional item pools, the EAP and SPRT methods are also recommended.
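For readers unfamiliar with the two termination rules compared in the abstract, the sketch below illustrates, in Python, how SPRT and the CI method each decide between classifying an examinee and continuing the test. This is a minimal illustrative sketch, not the authors' simulation code: it assumes a 2PL item response model, and the item parameters, cut score, and error rates (delta, alpha, beta) are hypothetical values chosen only for the example.

```python
# Illustrative sketch of the two CACT termination rules compared in the study:
# Wald's SPRT and the confidence-interval (CI) method. Assumes a 2PL model;
# all numeric settings below are hypothetical.
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def sprt_decision(responses, items, cut, delta=0.3, alpha=0.05, beta=0.05):
    """SPRT: accumulate the log-likelihood ratio of theta_1 = cut + delta
    versus theta_0 = cut - delta and compare it with the decision boundaries."""
    upper = math.log((1 - beta) / alpha)   # classify above cut when reached
    lower = math.log(beta / (1 - alpha))   # classify below cut when reached
    llr = 0.0
    for x, (a, b) in zip(responses, items):
        p1 = p_correct(cut + delta, a, b)
        p0 = p_correct(cut - delta, a, b)
        llr += x * math.log(p1 / p0) + (1 - x) * math.log((1 - p1) / (1 - p0))
    if llr >= upper:
        return "above cut"
    if llr <= lower:
        return "below cut"
    return "continue testing"

def ci_decision(theta_hat, se, cut, z=1.96):
    """CI rule: classify as soon as the confidence interval around the current
    ability estimate no longer contains the cut score."""
    low, high = theta_hat - z * se, theta_hat + z * se
    if low > cut:
        return "above cut"
    if high < cut:
        return "below cut"
    return "continue testing"

# Example: five administered items with (a, b) parameters and scored responses.
items = [(1.2, -0.5), (0.9, 0.0), (1.5, 0.3), (1.1, -0.2), (0.8, 0.6)]
responses = [1, 1, 0, 1, 1]
print(sprt_decision(responses, items, cut=0.0))
print(ci_decision(theta_hat=0.45, se=0.20, cut=0.0))
```

In a variable-length classification test, one of these checks is run after each administered item; the study's finding that CI reached comparable accuracy with roughly half the test length reflects how quickly each rule's stopping condition tends to be met.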

References

  • Ayan, C. (2018). Comparing the psychometric features of traditional and computerized adaptive classification test in the cognitive diagnostic model [Unpublished Doctoral Dissertation]. Ankara University.
  • Aybek, E. C. (2016). An investigation of applicability of the self assessment inventory as a computerized adaptive test (CAT) [Unpublished Doctoral Dissertation]. Ankara University.
  • Breslow, N. E., & Holubkov, R. (1997). Weighted likelihood, pseudo-likelihood and maximum likelihood methods for logistic regression analysis of two-stage data. Statistics in Medicine, 16, 103-116.
  • Cheng, P. E., & Liou, M. (2000). Estimation of trait level in computerized adaptive testing. Applied Psychological Measurement, 24(3), 257–265.
  • Demir, S. (2019). Investigation of classification accuracy at computerized adaptive classification tests [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Diao, Q., & Reckase, M. (2009). Comparison of ability estimation and item selection methods in multidimensional computerized adaptive testing. In D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing (pp. 1-13). UMI Research Press.
  • Dogan, N., Soysal, S., & Karaman, H. (2017). Aynı örnekleme açımlayıcı ve doğrulayıcı faktör analizi uygulanabilir mi? [Can exploratory and confirmatory factor analysis be conducted to the same sample?]. In Ö. Demirel & S. Dinçer (Eds.), Küreselleşen dünyada eğitim [Education in a globalizing world] (pp. 373-400). Pegema Publishing.
  • Doğruöz, E. (2018). Investigation of adaptive multistage test based on test assembly methods [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Dooley, K. (2002). Simulation research methods. In J. Baum (Ed.), Companion to organizations (pp. 829-848). Blackwell.
  • Eggen, T. J. H. M. (1999). Item selection in adaptive testing with the sequential probability ratio test. Applied Psychological Measurement, 23, 249-260.
  • Eggen, T. J. H. M., & Straetmans, G. J. J. M. (2000). Computerized adaptive testing for classifying examinees into three categories. Educational and Psychological Measurement, 60(5), 713-734.
  • Erdem Kara, B. (2019). The effect of item ratio indicating differential item functioning on computer adaptive and multi stage tests [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Finkelman, M. (2008). On using stochastic curtailment to shorten the SPRT in sequential mastery testing. Journal of Educational and Behavioral Statistics, 33(4), 442-463.
  • Flaugher, R. (2000). Item Pools. In Wainer, H. (Ed.), Computerized adaptive testing: A Primer (pp. 37-59). Erlbaum.
  • Gökçe, S. (2012). Comparison of linear and adaptive versions of the Turkish pupil monitoring system (PMS) mathematics assessment [Unpublished Doctoral Dissertation]. Middle East Technical University.
  • Gündeğer, C. (2017). A comparison of computerized adaptive classification test criteria in terms of classification accuracy and test length [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Gündeğer, C., & Doğan, N. (2018). The effects of item pool characteristics on test length and classification accuracy in computerized adaptive classification testings. Hacettepe University Journal of Education, 33(4), 888-896.
  • Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Kluwer Nijhoff Publishing.
  • Hsiehi, M. (2015). Examination of sequential probability ratio tests in the setting of computerized classification tests: A simulation study. International Journal of Innovative Management, Information & Production, 6(2), 38-46.
  • Jiao, H., & Lau, A. C. (2003, April 2-24). The effects of model misfit in computerized classification test [Paper presentation]. The annual meeting of the National Council of Educational Measurement, Chicago, IL, USA.
  • Kaçar, M. (2016). Investigation of maximum fisher item selection method on computerized adaptive testing [Unpublished Master's Thesis]. Necmettin Erbakan University.
  • Kalender, İ. (2011). Effects of different computerized adaptive testing strategies on recovery of ability [Unpublished Doctoral Dissertation]. Middle East Technical University.
  • Kezer, F. (2013). Comparison of the computerized adaptive testing strategies [Unpublished Doctoral Dissertation]. Ankara University.
  • Kezer, F. (2021). The effect of item pools of different strengths on the test results of computerized adaptive testing. International Journal of Assessment Tools in Education, 8(1), 145–155.
  • Lau, C. A. (1996). Robustness of a unidimensional computerized testing mastery procedure with multidimensional testing data [Unpublished Doctoral Dissertation]. University of Iowa.
  • Lau, C. A., & Wang, T. (1999, April 19-23). Computerized classification testing under practical constraints with a polytomous model [Paper presentation]. AERA Annual Meeting. Montreal, Canada.
  • Lin, C. J., & Spray, J. A. (2000). Effects of item-selection criteria on classification testing with the sequential probability ratio test (Research Report No. 2000-8). ACT.
  • Nydick, S. W. (2013). Multidimensional mastery testing with CAT [Unpublished Doctoral Dissertation]. University of Minnesota.
  • Nydick, S. W., Nozawa, Y., & Zhu, R. (2012, April 12-16). Accuracy and efficiency in classifying examinees using computerized adaptive tests: an application to a large scale test [Paper presentation]. The Annual Meeting of the National Council on Measurement in Education. Vancouver, Canada.
  • Özdemir, B. (2015). Examining the effects of item level dimensionality models on multidimensional computerized adaptive testing method [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Penfield, R. D., & Bergeron, J. M. (2005). Applying a weighted maximum likelihood latent trait estimator to the generalized partial credit model. Applied Psychological Measurement, 29(3), 218–233.
  • R Core Team (2019). R: A language and environment for statistical computing [Computer software]. Vienna, Austria: R Foundation for Statistical Computing.
  • Reckase, M. D. (1983). A procedure for decision making using tailored testing. In D. J. Weiss (Ed.), New horizons in testing: Latent trait theory and computerized adaptive testing (pp. 237-254). Academic Press.
  • Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36.
  • Seitz, N. N., & Frey, A. (2013). The sequential probability ratio test for multidimensional adaptive testing with between-item multidimensionality. Psychological Test and Assessment Modeling, 55, 105-123.
  • Spray, J. A., Abdel-fatah, A. A., Huang, C.-Y., & Lau, C. A. (1997). Unidimensional approximations for a computerized test when the item pool and latent space are multidimensional (Research Report No. 97-5). ACT.
  • Spray, J. A., & Reckase, M. D. (1994). The selection of test items for decision making with a computer adaptive test [Paper presentation]. Annual Meeting of the National Council on Measurement in Education, New Orleans, USA.
  • Spray, J. A., & Reckase, M. D. (1996). Comparison of SPRT and sequential Bayes procedures for classifying examinees into two categories using a computerized test. Journal of Educational and Behavioral Statistics, 21(4), 405-414.
  • Şahin, M. D. (2017). Examining the results of multidimensional computerized adaptive testing applications in real and generated data sets [Unpublished Doctoral Dissertation]. Hacettepe University.
  • Şenel, S. (2017). Investigation of the compatibility of computerized adaptive testing on students with visually impaired [Unpublished Doctoral Dissertation]. Ankara University.
  • Tao, J., Shi, N. Z., & Chang, H. H. (2012). Item-weighted likelihood method for ability estimation in tests composed of both dichotomous and polytomous items. Journal of Educational and Behavioral Statistics, 37(2), 298-315.
  • Thompson, N. A., & Ro, S. (2007). Computerized classification testing with composite hypotheses. In D. J. Weiss (Ed.), Proceedings of the 2007 GMAC Conference on Computerized Adaptive Testing (pp. 1-13). UMI Research Press.
  • Thompson, N. A. (2007a). A comparison of two methods of polytomous computerized classification testing for multiple cutscores [Unpublished Doctoral Dissertation]. University of Minnesota.
  • Thompson, N. A. (2007b). A practitioner’s guide for variable-length computerized classification testing. Practical Assessment, Research & Evaluation, 12(1), 1-13.
  • Thompson, N. A. (2009). Item selection in computerized classification testing. Educational and Psychological Measurement, 69(5), 778-793.
  • Thompson, N. A. (2011). Termination criteria for computerized classification testing. Practical Assessment, Research & Evaluation, 16(4), 1-7.
  • Wang, T. (1997, March 24-28). Essentially unbiased EAP estimates in computerized adaptive testing [Paper Presentation]. American Educational Research Association Conference. Chicago, USA.
  • Wang, T., Hanson, B. A., & Lau, C. A. (1999). Reducing bias in CAT trait estimation: A comparison of approaches. Applied Psychological Measurement, 23(3), 263-278.
  • Wang, T., & Vispoel, W. P. (1998). Properties of ability estimation methods in computerized adaptive testing. Journal of Educational Measurement, 35(2), 109-135.
  • Wang, S., & Wang, T. (2001). Precision of Warm’s weighted likelihood estimates for a polytomous model in computerized adaptive testing. Applied Psychological Measurement, 25(4), 317–331.
  • Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54(3), 427-450.
  • Weiss, D. J., & Kingsbury, G. G. (1984). Application of computerized adaptive testing to educational problems. Journal of Educational Measurement, 21, 361-375.
  • Wouda, J. T., & Eggen, T. J. H. M. (2009). Computerized classification testing in more than two categories by using stochastic curtailment. In D. J. Weiss (Ed.), Proceedings of the 2009 GMAC Conference on Computerized Adaptive Testing (pp. 1-13). UMI Research Press.
  • Yao, L. (2003). SimuMIRT [Computer software]. https://www.psychsoft.soe.vt.edu/report3.php?recordID=SimuMIRT
  • Yi, Q., Wang, T., & Ban, J. (2000). Effects of scale transformation and test termination rule on the precision of ability estimates in CAT (ACT Research Report Series No. 2000-2). ACT.
There are 55 references in total.

Details

Primary Language: English
Subjects: Field Education
Section: Research Articles
Authors

Ceylan Gündeğer 0000-0003-3572-1708

Sümeyra Soysal 0000-0002-7304-1722

Publication Date: December 31, 2022
Acceptance Date: December 4, 2022
Published Issue: Year 2022, Volume 4, Issue 2

How to Cite

APA Gündeğer, C., & Soysal, S. (2022). The Effect of Strong and Weak Unidimensional Item Pools on Computerized Adaptive Classification Testing. Journal of Teacher Education and Lifelong Learning, 4(2), 310-321. https://doi.org/10.51535/tell.1202804
