Research Article

Examination of Scale Transformation and Test Equating Methods in Testlet Based Tests

Volume: 33 Number: 3 July 25, 2025

Abstract

Purpose: This study examines test equating performance under various item response theory models and sample size conditions in testlet-based tests. Design/Methodology/Approach: Using data from the eTIMSS 2019 science test, the study compares scale transformation and test equating results under the Unidimensional Item Response Theory (UIRT), Testlet Response Theory (TRT), and bifactor models across varying sample sizes. Scale transformation methods, including the mean-sigma and Stocking-Lord methods, and both observed score and true score equating methods were applied within a common-item nonequivalent groups design. Equating performance was evaluated with RMSE and bias values. Findings: In a science test with low testlet effects, scale transformation results based on the UIRT model and test equating results based on the bifactor model showed lower error. As sample size increased, parameter estimation error generally decreased, and the TRT model in particular required a sample size of at least 500 for robust estimation. Highlights: The bifactor model, which takes testlet effects into account, yielded more precise and consistent results, supporting fair and reliable score equating. Using real data, this study illustrates the practical implications of testlet effects in tests that contain testlets.
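As a rough illustration of two quantities named in the abstract, the sketch below computes mean-sigma scale transformation coefficients from common-item difficulty estimates, along with RMSE and bias against a criterion. It is not the authors' code: the function names and the toy difficulty values are hypothetical, and the standard mean-sigma definition (slope as the ratio of standard deviations, intercept from the means) is assumed.

```python
import math
from statistics import mean, pstdev


def mean_sigma(b_new, b_old):
    """Mean-sigma coefficients (A, B) that place new-form parameters on
    the old-form scale, using common-item difficulty estimates.

    Transformed values: b* = A * b + B, a* = a / A, theta* = A * theta + B.
    """
    A = pstdev(b_old) / pstdev(b_new)
    B = mean(b_old) - A * mean(b_new)
    return A, B


def rmse(estimates, criterion):
    """Root mean squared error between estimates and criterion values."""
    return math.sqrt(mean([(e - c) ** 2 for e, c in zip(estimates, criterion)]))


def bias(estimates, criterion):
    """Mean signed difference between estimates and criterion values."""
    return mean([e - c for e, c in zip(estimates, criterion)])


# Hypothetical common-item difficulties (not taken from the study).
b_old = [-1.0, 0.0, 1.0]
b_new = [-0.75, -0.25, 0.25]

A, B = mean_sigma(b_new, b_old)
b_transformed = [A * b + B for b in b_new]
print(A, B)
print(rmse(b_transformed, b_old), bias(b_transformed, b_old))
```

In this toy case the new-form difficulties are an exact linear function of the old-form ones, so the transformed values recover the old scale and both error indices are essentially zero. With real estimates, RMSE and bias would be computed against whatever criterion the equating design defines.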

Keywords

Testlet, Testlet effect, Test equating, Testlet Response Theory

APA
Dilek, H., Atalay Kabasakal, K., & Gören, S. (2025). Examination of Scale Transformation and Test Equating Methods in Testlet Based Tests. Kastamonu Education Journal, 33(3), 658-671. https://doi.org/10.24106/kefdergi.1750267