Testing Multistage Testing Configurations: Post-Hoc vs. Hybrid Simulations

Volume: 7 Number: 1 May 1, 2020
  • Halil Sari

Abstract

Because of their low cost, Monte Carlo (MC) simulations have been conducted extensively in educational measurement. However, the results derived from MC studies may not always generalize to operational settings. The purpose of this study was to provide a methodological discussion of other types of simulation methods, to run post-hoc and hybrid simulations in a multistage testing (MST) environment, and to discuss the findings and interpretations derived from each simulation method. Real data collected via paper-and-pencil administrations from 652 students were used to test different MST design configurations under different test lengths. The test lengths were 24, 36, and 48 items, and the MST designs were the 1-2, 1-3, 1-2-2, 1-3-3, 1-4-4, and 1-5-5 designs. Both post-hoc and hybrid simulations were run with the same group of students, and all analyses were completed in R. The results indicated that, in terms of absolute bias, RMSE, and Pearson correlations, post-hoc and hybrid simulations generally produced comparable outcomes. Regardless of the simulation method, the 1-5-5 MST design was the most recommended design. In addition, outcomes generally improved as test length increased in all simulations. Advantages and disadvantages of each method, recommendations for practitioners, and limitations for future research are provided in the study.
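The abstract evaluates ability-estimate recovery with absolute bias, RMSE, and Pearson correlations between true and estimated abilities. As a minimal illustration of these metrics (the paper's analyses were done in R; this Python sketch assumes one common operationalization of absolute bias as the mean absolute error, which may differ from the paper's exact definition):

```python
import math

def evaluate_recovery(true_theta, est_theta):
    """Return (absolute bias, RMSE, Pearson r) for ability estimates.

    Note: "absolute bias" is computed here as the mean absolute error,
    one common operationalization; the study may define it differently.
    """
    n = len(true_theta)
    diffs = [e - t for t, e in zip(true_theta, est_theta)]

    # Mean absolute difference between estimated and true abilities
    abs_bias = sum(abs(d) for d in diffs) / n

    # Root mean squared error of the estimates
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Pearson correlation between true and estimated abilities
    mean_t = sum(true_theta) / n
    mean_e = sum(est_theta) / n
    cov = sum((t - mean_t) * (e - mean_e)
              for t, e in zip(true_theta, est_theta))
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in true_theta))
    sd_e = math.sqrt(sum((e - mean_e) ** 2 for e in est_theta))
    r = cov / (sd_t * sd_e)

    return abs_bias, rmse, r
```

In a post-hoc or hybrid MST simulation, these three values would be computed per design (e.g., 1-2 through 1-5-5) and per test length (24, 36, 48 items) to compare configurations.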


Details

Primary Language

English


Authors

Halil Sari

Publication Date

May 1, 2020


Published in Issue

Year 2020 Volume: 7 Number: 1

APA
Sari, H. (2020). Testing Multistage Testing Configurations: Post-Hoc vs. Hybrid Simulations. International Journal of Psychology and Educational Studies, 7(1), 27-37. https://doi.org/10.17220/ijpes.2020.01.003