Investigation of Measurement Precision and Test Lengths in Computerized Adaptive Tests in Different Conditions

Hüseyin Yıldız; Ceren Tunaboylu; Süleyman Ülkü; Gamze Giray; Hülya Kelecioğlu

doi:10.21031/epod.1068572

Research Article

Year 2024, Volume: 15 Issue: 1, 5-17, 31.03.2024

Abstract

This study examines how the ability estimation method, content balancing, item exposure rate, and termination rule affect test length and measurement precision in computerized adaptive testing (CAT). Two ability estimation methods (EAP and MLE), three content balancing patterns (1, 2, and 4 groups), three exposure rates (0.50, 0.75, and 1.00), and four termination rules (standard error thresholds of 0.35 and 0.40, and fixed test lengths of 15 and 30 items) were crossed, giving 72 conditions in total (2 × 3 × 3 × 4). The conditions were compared in terms of correlation, bias, RMSE, and test length. Data generation and analysis were carried out in R. The best measurement performance was obtained with the fixed test length of 30 items and with the 0.35 standard error rule, under the 1-group pattern, in which content balancing imposes no group restriction, and with exposure rates between 0.75 and 1.00. Across ability estimation methods, content balancing patterns, and exposure rates, test length ranged between 22 and 25 items; by termination rule, test length was lowest under the 0.40 standard error rule and reached as many as 39 items under the standard error-based rules.
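The design and the evaluation criteria named in the abstract can be illustrated with a minimal sketch. It is written in Python for illustration only; the study itself was carried out in R, and none of this is the authors' code. The function names (`design_grid`, `should_stop`, `recovery_metrics`), the rule labels `"se"` and `"length"`, and the sign convention for bias (estimate minus true value) are assumptions, since the abstract does not give the exact definitions.

```python
from itertools import product
import math

# Factor levels as listed in the abstract.
ESTIMATORS = ["EAP", "MLE"]
CONTENT_GROUPS = [1, 2, 4]
EXPOSURE_RATES = [0.50, 0.75, 1.00]
# Termination rules as (kind, value); the labels are hypothetical.
TERMINATION_RULES = [("se", 0.35), ("se", 0.40), ("length", 15), ("length", 30)]


def design_grid():
    """Return every combination of the four factors (fully crossed design)."""
    return list(product(ESTIMATORS, CONTENT_GROUPS, EXPOSURE_RATES, TERMINATION_RULES))


def should_stop(rule, n_items, standard_error):
    """Assumed stopping logic: stop once the standard error reaches the
    threshold, or once the fixed number of items has been administered."""
    kind, value = rule
    if kind == "se":
        return standard_error <= value
    return n_items >= value


def recovery_metrics(true_theta, est_theta):
    """Bias, RMSE, and Pearson correlation between true and estimated abilities."""
    n = len(true_theta)
    errors = [e - t for t, e in zip(true_theta, est_theta)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    mean_t = sum(true_theta) / n
    mean_e = sum(est_theta) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_theta, est_theta))
    var_t = sum((t - mean_t) ** 2 for t in true_theta)
    var_e = sum((e - mean_e) ** 2 for e in est_theta)
    correlation = cov / math.sqrt(var_t * var_e)
    return {"bias": bias, "rmse": rmse, "correlation": correlation}


if __name__ == "__main__":
    print(len(design_grid()))  # number of crossed conditions
    print(recovery_metrics([0.0, 1.0, 2.0], [0.5, 1.5, 2.5]))
```

In a full simulation, each condition in the grid would be run on generated response data and the three recovery metrics, together with the realized test length, would be recorded per condition.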

Keywords

computerized adaptive testing, content balancing, exposure rate, simulation study


Details

Primary Language	English
Section	Articles
Authors	Hüseyin Yıldız 0000-0003-2387-263X Ceren Tunaboylu 0000-0001-8090-8913 Süleyman Ülkü 0000-0003-1965-0671 Gamze Giray 0000-0002-5795-4521 Hülya Kelecioğlu 0000-0002-0741-9934
Publication Date	March 31, 2024
Acceptance Date	June 1, 2023
Published in Issue	Year 2024 Volume: 15 Issue: 1

Cite

APA	Yıldız, H., Tunaboylu, C., Ülkü, S., Giray, G., & Kelecioğlu, H. (2024). Investigation of Measurement Precision and Test Lengths in Computerized Adaptive Tests in Different Conditions. Journal of Measurement and Evaluation in Education and Psychology, 15(1), 5-17. https://doi.org/10.21031/epod.1068572