Can TIMSS Mathematics Assessments be Implemented as a Computerized Adaptive Test?
Abstract
In recent years, there has been growing interest in, and extensive use of, computerized adaptive testing (CAT), especially in large-scale assessments. Numerous simulation studies have been conducted on both real and simulated data sets to determine the optimum conditions and to develop CAT versions. One of the most widely used large-scale assessment programs, the Trends in International Mathematics and Science Study (TIMSS), has been administered as a paper-and-pencil test to monitor student achievement in mathematics and science at the fourth and eighth grade levels since 1995. The purpose of this study is to investigate the optimum CAT algorithm for the TIMSS eighth grade mathematics assessments. Because Turkey and the USA participated in the 2007, 2011, and 2015 administrations, their data were combined and 393 items were calibrated on a common scale using marginal maximum likelihood estimation. With this item pool, several scenarios were proposed and tested to determine the optimum starting rule, ability estimation method, and test termination rule, as well as the efficiency of the exposure control method. The results indicated that estimating abilities with the expected a posteriori (EAP) method after 6 random items and terminating the fixed-length test after 20 items appeared to be the optimum algorithm for the TIMSS eighth grade mathematics assessments. Item exposure control was also found to be of primary importance for the effective use of the item pool. This study has implications for both national and international large-scale test developers in determining the optimum CAT algorithm and its consequences compared with paper-and-pencil versions.
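The algorithm summarized above (6 random starting items, EAP ability estimation, a fixed length of 20 items) can be illustrated with a minimal simulation sketch. This is not the authors' implementation. The two-parameter logistic (2PL) response model, the standard normal prior, maximum-information item selection after the random start, the randomly generated item parameters, and all function names are assumptions made for illustration only. Only the pool size of 393 items, the 6 random starting items, and the test length of 20 come from the abstract. Exposure control is omitted.

```python
import math
import random


def p_correct(theta, a, b):
    """Probability of a correct response under a 2PL model (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def eap(responses, grid):
    """EAP ability estimate with a standard normal prior (assumed prior).

    responses: list of (a, b, u) tuples, where u is 1 (correct) or 0.
    grid: quadrature points over the ability scale.
    """
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)  # unnormalized N(0, 1) density
        for a, b, u in responses:
            p = p_correct(t, a, b)
            w *= p if u == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(grid, weights)) / total


def run_cat(pool, true_theta, rng, n_random=6, test_length=20):
    """Simulate one fixed-length CAT for a single simulated examinee."""
    grid = [-4.0 + 0.1 * k for k in range(81)]
    available = list(range(len(pool)))
    administered, responses = [], []
    theta_hat = 0.0
    while len(administered) < test_length:
        if len(administered) < n_random:
            # Starting rule: random items before any ability estimate exists.
            idx = rng.choice(available)
        else:
            # Assumed selection rule: most informative remaining item.
            idx = max(available,
                      key=lambda i: information(theta_hat, *pool[i]))
        available.remove(idx)
        a, b = pool[idx]
        u = 1 if rng.random() < p_correct(true_theta, a, b) else 0
        administered.append(idx)
        responses.append((a, b, u))
        if len(administered) >= n_random:
            theta_hat = eap(responses, grid)
    return theta_hat, administered


# Hypothetical item pool: 393 items with made-up (a, b) parameters.
rng = random.Random(0)
pool = [(rng.uniform(0.5, 2.0), rng.gauss(0.0, 1.0)) for _ in range(393)]
theta_hat, administered = run_cat(pool, true_theta=0.5, rng=rng)
print(round(theta_hat, 2), len(administered))
```

Repeating `run_cat` over many simulated examinees and counting how often each item index appears in `administered` would give a rough picture of item exposure, which is the quantity an exposure control method is meant to limit.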
Details
Primary Language
English
Journal Section
Research Article
Authors
Semirhan Gokce
*
0000-0002-4752-5598
Türkiye
Cees A.W. Glas
0000-0001-6531-5503
The Netherlands
Publication Date
December 28, 2018
Submission Date
November 25, 2018
Acceptance Date
December 21, 2018
Published in Issue
Year 2018 Volume: 9 Number: 4
Cited By
Comparison of Different Computerized Adaptive Testing Approaches with Shadow Test Under Different Test Length and Ability Estimation Method Conditions
Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi
https://doi.org/10.21031/epod.1202599