Research Article

Ability Estimation with Polytomous Items in Computerized Multistage Tests

Year 2023, Volume: 14 Issue: 3, 171 - 184, 30.09.2023
https://doi.org/10.21031/epod.1056079

Abstract

This study examines how ability estimates change under different conditions in computerized multistage tests composed of polytomous items. It is a simulation study in which 108 conditions (3 x 3 x 6 x 2 = 108) were examined, crossing three numbers of response categories (3, 4, and 5), three test lengths (10, 20, and 30 items), six panel designs (1-2, 1-2-2, 1-3, 1-3-3, 1-4, and 1-4-4), and two routing methods (Maximum Fisher Information (MFI) and random). Simulations and analyses were carried out with the mstR package in R, using a pool of 200 items, 1000 simulated examinees, and 100 replications. Mean absolute bias, RMSE, and correlation values were calculated as outcome measures. As the number of categories and the test length increased, mean absolute bias and RMSE decreased while the correlations increased. The two routing methods showed similar trends, but MFI gave better results. Results were similar across the panel designs.
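The fully crossed design and the three outcome measures named in the abstract can be illustrated with a short sketch. It is written in Python for illustration only; the study itself used the mstR package in R. The function names are hypothetical, and the formulas are the standard textbook definitions of mean absolute bias, RMSE, and Pearson correlation, which are assumed here because the abstract does not state the exact expressions used.

```python
import itertools
import math

# Design factors as listed in the abstract.
CATEGORIES = (3, 4, 5)
TEST_LENGTHS = (10, 20, 30)
PANEL_DESIGNS = ("1-2", "1-2-2", "1-3", "1-3-3", "1-4", "1-4-4")
ROUTING_METHODS = ("MFI", "Random")


def build_conditions():
    """Return every combination of the four design factors."""
    return list(itertools.product(
        CATEGORIES, TEST_LENGTHS, PANEL_DESIGNS, ROUTING_METHODS))


def mean_absolute_bias(true_theta, est_theta):
    """Average absolute difference between estimated and true abilities."""
    n = len(true_theta)
    return sum(abs(e - t) for t, e in zip(true_theta, est_theta)) / n


def rmse(true_theta, est_theta):
    """Root mean squared error of the ability estimates."""
    n = len(true_theta)
    return math.sqrt(
        sum((e - t) ** 2 for t, e in zip(true_theta, est_theta)) / n)


def pearson_r(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    conditions = build_conditions()
    print(len(conditions))  # 3 * 3 * 6 * 2 = 108

    # Made-up abilities for illustration; these are not study data.
    true_theta = [-1.0, 0.0, 1.0, 2.0]
    est_theta = [-0.5, 0.0, 0.5, 2.0]
    print(mean_absolute_bias(true_theta, est_theta))
    print(rmse(true_theta, est_theta))
    print(pearson_r(true_theta, est_theta))
```

In a simulation like the one described, these measures would be computed for each condition in each replication and then averaged over the 100 replications. Lower bias and RMSE together with a higher correlation indicate better recovery of the true abilities.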

References

  • Chen, L-Y. (2010). An investigation of the optimal test design for multi-stage test using the generalized partial credit model. [Doctoral dissertation, The University of Texas]. UT Electronic Theses and Dissertations. Retrieved from https://repositories.lib.utexas.edu/handle/2152/ETD-UT-2010-12-344
  • Donoghue, J. R. (1994). An empirical examination of the IRT information of polytomously scored reading items under the generalized partial credit model. Journal of Educational Measurement, 31(4), 295-311. https://doi.org/10.1111/j.1745-3984.1994.tb00448.x
  • Dodd, B. G., De Ayala, R. J., & Koch, W. R. (1995). Computerized Adaptive Testing with Polytomous Items. Applied Psychological Measurement, 19(1), 5-22. https://doi.org/10.1177/014662169501900103
  • Embretson, S. E., & Reise, S. P. (2013). Item response theory. Psychology Press.
  • Han, K. T. (2007). WinGen: Windows software that generates item response theory parameters and item responses. Applied Psychological Measurement, 31(5), 457-459. https://doi.org/10.1177/0146621607299271
  • Hendrickson, A. (2007). An NCME instructional module on multistage testing. Educational Measurement: Issues and Practice, 26(2), 44-52. https://doi.org/10.1111/j.1745-3992.2007.00093.x
  • ILOG. (2006). ILOG CPLEX 10.0 [User’s manual]. Paris, France: ILOG S.A. Retrieved from https://www.lix.polytechnique.fr/~liberti/teaching/xct/cplex/usrcplex.pdf
  • Kim, J., Chung, H., & Dodd, B. G. (2010, May). Comparing routing methods in the multistage test based on the partial credit model [Conference presentation]. In AERA, Denver, CO.
  • Kim, J., Chung, H., Park, R., & Dodd, B. G. (2013). A comparison of panel designs with routing methods in the multistage test with the partial credit model. Behavior Research Methods, 45, 1087–1098. https://doi.org/10.3758/s13428-013-0316-3
  • Luecht, R. M., & Nungester, R. J. (1998). Some practical examples of computer‐adaptive sequential testing. Journal of Educational Measurement, 35(3), 229-249. https://doi.org/10.1111/j.1745-3984.1998.tb00537.x
  • Luecht, R. M. (2000, April). Implementing the computer-adaptive sequential testing (CAST) framework to mass produce high quality computer-adaptive and mastery tests. [Conference presentation]. In NCME, New Orleans, LA. Retrieved from https://eric.ed.gov/?id=ED442823
  • Macken-Ruiz, C. L. (2008). A comparison of multi-stage and computerized adaptive tests based on the generalized partial credit model (Publication No. 3328282) [Doctoral dissertation, The University of Texas]. ProQuest Dissertations Publishing. Retrieved from https://www.proquest.com/docview/304482829?pq-origsite=gscholar&fromopenview=true
  • Magis, D., Yan, D., von Davier, A., & Magis, M. D. (2018). Package ‘mstR’. Retrieved from https://cran.r-project.org/web/packages/mstR/mstR.pdf
  • Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149-174. https://doi.org/10.1007/BF02296272
  • Muraki, E. (1992). A generalized partial credit model: Application of an EM algorithm. ETS Research Report Series, 1992(1), i–30. https://doi.org/10.1002/j.2333-8504.1992.tb01436.x
  • Öztürk, N. B. (2019). How the Length and Characteristics of Routing Module Affect Ability Estimation in ca-MST? Universal Journal of Educational Research, 7(1), 164-170. https://doi.org/10.13189/ujer.2019.070121
  • R Core Team. (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing.
  • Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika Monograph Supplement, 34 (17). Retrieved from https://psycnet.apa.org/record/1972-04809-001
  • Sari, H. I., & Raborn, A. (2018). What Information Works Best?: A Comparison of Routing Methods. Applied Psychological Measurement, 42(6), 499-515. https://doi.org/10.1177/0146621617752990
  • Sari, H.I., Yahsi Sari, H., & Huggins Manley, A.C. (2016). Computer adaptive multistage testing: Practical issues, challenges and principles. Journal of Measurement and Evaluation in Education and Psychology, 7(2), 388-406. https://doi.org/10.21031/epod.280183
  • Weiss, D. J. (1982). Improving Measurement Quality and Efficiency with Adaptive Testing. Applied Psychological Measurement, 6(4), 473–492. https://doi.org/10.1177/014662168200600408
  • Weiss, D. J. (1983). Latent trait theory and adaptive testing. In D. J. Weiss (Ed.), New horizons in testing (pp. 5-7). Academic Press.
  • Zenisky, A. L. (2004). Evaluating the effects of several multi-stage testing design variables on selected psychometric outcomes for certification and licensure assessment (Publication No. 5710) [Doctoral dissertation, University of Massachusetts Amherst]. UMass Amherst Libraries. https://scholarworks.umass.edu/dissertations_1/5710
  • Zenisky, A., Hambleton, R. K., & Luecht, R. M. (2009). Multistage Testing: Issues, Designs, and Research. In W. van der Linden & C. Glas (Eds.), Elements of Adaptive Testing. Springer.
  • Zurovac, J., Cook, T. D., Deke, J., Finucane, M. M., Chaplin, D., Coopersmith, J. S., ... & Forrow, L. V. (2021). Absolute and Relative Bias in Eight Common Observational Study Designs: Evidence from a Meta-analysis. https://arxiv.org/ftp/arxiv/papers/2111/2111.06941.pdf
There are 25 citations in total.

Details

Primary Language: English
Subjects: Testing, Assessment and Psychometrics (Other)
Journal Section: Articles
Authors

Hasibe Yahsi Sarı 0000-0002-0451-6034

Hülya Kelecioğlu 0000-0002-0741-9934

Publication Date: September 30, 2023
Acceptance Date: December 21, 2022
Published in Issue: Year 2023

Cite

APA Yahsi Sarı, H., & Kelecioğlu, H. (2023). Ability Estimation with Polytomous Items in Computerized Multistage Tests. Journal of Measurement and Evaluation in Education and Psychology, 14(3), 171-184. https://doi.org/10.21031/epod.1056079