Investigation of item parameter drift of anchor items in TIMSS practices
Abstract
This study examines item parameter drift (IPD) in eighth-grade science anchor items from the paper-based cycles of the Trends in International Mathematics and Science Study (TIMSS) in 2011, 2015, and 2019. Korea, Italy, and Saudi Arabia were selected as examples of high-, medium-, and low-achieving education systems to explore how IPD develops across different performance contexts. Using the three-parameter logistic model (3PLM), with item parameters estimated in Lexter and D² drift statistics computed in R, the study evaluated changes in item parameters across years and country pairs for 30 multiple-choice anchor items. The results showed that eight items exhibited temporal IPD in at least one comparison period, with most drift concentrated in the 2015–2019 and 2011–2019 contrasts. Drift occurred mainly in chemistry items at the applying cognitive level, alongside a smaller number of biology and physics items. Cross-country analyses identified 18 items with IPD in at least one country pair, with the largest number of drifting items observed between Korea and Saudi Arabia. Items involving symbolic chemical notation and representational demands were especially likely to drift across systems that differ in achievement level and opportunity to learn. These findings indicate that IPD in TIMSS science is concentrated in a subset of anchor items and is shaped jointly by temporal change and cross-national differences. The study underscores the importance of routine IPD monitoring, targeted review of repeatedly drifting items, and careful anchor selection to support valid cross-cycle and cross-national comparisons in large-scale assessments.
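The core idea behind drift detection in the abstract can be illustrated with a minimal sketch: estimate 3PL item parameters in two calibrations (e.g., two TIMSS cycles), then quantify how far apart the resulting item characteristic curves are. The index below (mean squared difference between the two curves over an ability grid) is a simplified illustration of this logic, not the exact D² statistic implemented in Lexter; the function names and example parameter values are hypothetical.

```python
import numpy as np

def p3pl(theta, a, b, c):
    """Three-parameter logistic item response function:
    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))."""
    return c + (1 - c) / (1 + np.exp(-a * (theta - b)))

def drift_index(params_t1, params_t2, thetas=None):
    """Mean squared difference between two ICCs over an ability grid.
    A simplified drift measure, not the Lexter D^2 statistic itself."""
    if thetas is None:
        thetas = np.linspace(-4, 4, 161)  # ability grid
    p1 = p3pl(thetas, *params_t1)
    p2 = p3pl(thetas, *params_t2)
    return float(np.mean((p1 - p2) ** 2))

# Hypothetical (a, b, c) parameters for one anchor item in two cycles:
stable = drift_index((1.2, 0.10, 0.20), (1.2, 0.15, 0.20))  # near-identical
drifted = drift_index((1.2, 0.10, 0.20), (1.2, 1.10, 0.20))  # b shifts one logit
assert drifted > stable
```

In practice a flagged item would be compared against a chi-square or standardized cutoff rather than eyeballed, but the same principle applies: large curve-level discrepancies between calibrations signal that an anchor item no longer functions equivalently.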
References
- Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716-723. https://doi.org/10.1109/TAC.1974.1100705
- Babcock, B., & Albano, A. D. (2012). Rasch scale stability in the presence of item parameter and trait drift. Applied Psychological Measurement, 36(7), 565-580. https://doi.org/10.1177/0146621612455090
- Bock, R. D., Muraki, E., & Pfeiffenberger, W. (1988). Item pool maintenance in the presence of item parameter drift. Journal of Educational Measurement, 25(4), 275-285. https://doi.org/10.1111/j.1745-3984.1988.tb00308.x
- De Ayala, R.J. (2022). The theory and practice of item response theory (2nd ed.). Guilford Press.
- Deng, H., & Melican, G. (2010). An investigation of scale drift in computer adaptive test. (Report no. 2010-2). The College Board. https://files.eric.ed.gov/fulltext/ED561045.pdf
- Fishbein, B., Martin, M. O., Mullis, I. V. S., & Foy, P. (2018). The TIMSS 2019 item equivalence study: Examining mode effects for computer-based assessment and implications for measuring trends. Large-scale Assessments in Education, 6(11), 1-23. https://doi.org/10.1186/s40536-018-0064-z
- Glas, C., & Van Buuren, N. (2021). Lexter Manual. https://shinylexter.com/
- Goldstein, H. (1983). Measuring changes in educational attainment over time: Problems and possibilities. Journal of Educational Measurement, 20(4), 369-377. https://www.jstor.org/stable/1434953
Details
- Primary Language: English
- Subjects: Cross-Cultural Comparisons of Education: International Examinations
- Journal Section: Research Article
- Early Pub Date: February 15, 2026
- Publication Date: February 15, 2026
- Submission Date: April 8, 2025
- Acceptance Date: December 2, 2025
- Published in Issue: Year 2026, Number: Advanced Online Publication