Research Article

The effect of polytomous item ratio on ability estimation in multistage tests

Year 2025, Volume: 12 Issue: 2, 414 - 429, 01.06.2025
https://doi.org/10.21449/ijate.1474855

Abstract

The aim of this study is to examine the effect of the polytomous item ratio on ability estimation under different conditions in multistage tests (MST) built from mixed-format tests. The study is simulation-based. The examinees' ability parameters and the item pool were created using item parameters estimated from the dichotomous and polytomous items of the PISA 2018 reading assessment. The MST conditions were panel design, test length, routing method, and polytomous item ratio. The simulation data, MST designs, and analyses were produced with WinGen, CPLEX, and the "mstR" package in RStudio. A total of 108 conditions were examined, each with 100 replications. For each simulation, RMSE, mean absolute bias, and correlation values were calculated. The results show that as the ratio of polytomous items in the tests increases from 10% to 50%, the mean absolute bias and RMSE values decrease while the correlation values increase. Likewise, as test length increases, RMSE and mean absolute bias decrease while correlations increase. Among routing methods, maximum Fisher information (MFI) performed better than number-correct (NC) routing. In general, three-stage panel designs gave markedly better results than two-stage panel designs. In the 1-2 and 1-4 panel designs, the choice of routing method made no practical difference.
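The abstract evaluates estimation quality with three standard recovery indices: RMSE, mean absolute bias, and the correlation between true and estimated abilities. As a minimal sketch (not the authors' code, which used R and the "mstR" package), these indices can be computed from a vector of generated (true) theta values and a vector of estimates; the theta values below are made up for illustration:

```python
import math

def evaluation_metrics(theta_true, theta_est):
    """RMSE, mean absolute bias, and Pearson correlation
    between true and estimated ability (theta) values."""
    n = len(theta_true)
    rmse = math.sqrt(sum((e - t) ** 2 for t, e in zip(theta_true, theta_est)) / n)
    mab = sum(abs(e - t) for t, e in zip(theta_true, theta_est)) / n
    mt = sum(theta_true) / n
    me = sum(theta_est) / n
    cov = sum((t - mt) * (e - me) for t, e in zip(theta_true, theta_est))
    var_t = sum((t - mt) ** 2 for t in theta_true)
    var_e = sum((e - me) ** 2 for e in theta_est)
    corr = cov / math.sqrt(var_t * var_e)
    return rmse, mab, corr

# Illustrative (made-up) true and estimated theta values:
theta_true = [-1.2, -0.5, 0.0, 0.4, 1.1]
theta_est = [-1.0, -0.6, 0.1, 0.5, 0.9]
rmse, mab, corr = evaluation_metrics(theta_true, theta_est)
```

In the study these values would be averaged over the 100 replications of each of the 108 conditions; lower RMSE and mean absolute bias and higher correlation indicate better ability recovery.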

References

  • Armstrong, R.D. (2002). Routing rules for multi-form structures (LSAC Computerized Testing Report No. 02-08). Law School Admission Council. https://searchworks.stanford.edu/view/6794803
  • Bock, R.D., & Mislevy, R.J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6(4), 431-444. https://doi.org/10.1177/014662168200600405
  • Boztunç-Öztürk, N. (2019). How the length and characteristics of routing module affect ability estimation in CA-MST? Universal Journal of Educational Research, 7(1), 164-170. https://doi.org/10.13189/ujer.2019.070121
  • Cıkrıkcı, N., Yalçın, S., Kalender, İ., Gül, E., et al. (2020). Development of a computerized adaptive version of the Turkish Driving Licence Exam. International Journal of Assessment Tools in Education, 7(4), 570-587. https://doi.org/10.21449/ijate.716177
  • Dallas, A. (2014). The effects of routing and scoring within a computer adaptive multi-stage framework [Doctoral dissertation, University of North Carolina]. University of North Carolina Libraries. https://libres.uncg.edu/ir/uncg/f/Dallas_uncg_0154D_11394.pdf
  • Doğruöz, E. (2018). Bireyselleştirilmiş çok aşamalı testlerin test birleştirme yöntemlerine göre incelenmesi [Doktora tezi, Hacettepe Üniversitesi]. Hacettepe Üniversitesi Açık Erişim Sistemi. http://hdl.handle.net/11655/5298
  • Erdem-Kara, B. (2019). Değişen madde fonksiyonu gösteren madde oranının bireyselleştirilmiş bilgisayarlı ve çok aşamalı testler üzerindeki etkisi [Doktora tezi, Hacettepe Üniversitesi]. Hacettepe Üniversitesi Açık Erişim Sistemi. http://hdl.handle.net/11655/11968
  • Han, K.T. (2007). WinGen: Windows software that generates item response theory parameters and item responses. Applied Psychological Measurement, 31(5), 457-459. https://doi.org/10.1177/0146621607299271
  • Hendrickson, A. (2007). An NCME instructional module on multistage testing. Educational Measurement: Issues and Practice, 26(2), 44-52. https://doi.org/10.1111/j.1745-3992.2007.00093.x
  • Kim, J., Chung, H., & Dodd, B.G. (2010, April). Comparing routing methods in the multistage test based on the partial credit model [Conference presentation]. Annual meeting of the American Educational Research Association, Denver, CO.
  • Kim, J., Chung, H., Dodd, B.G., & Park, R. (2012). Panel design variations in the multistage test using the mixed-format tests. Educational and Psychological Measurement, 72(4), 574-588. https://doi.org/10.1177/001316441142897
  • Kim, J., Chung, H., Park, R., & Dodd, B.G. (2013). A comparison of panel designs with routing methods in the multistage test with the partial credit model. Behavior Research Methods, 45(4), 1087-1098. https://doi.org/10.3758/s13428-013-0316-3
  • Kim, J., & Dodd, B.G. (2014). Mixed-format multistage tests: Issues and methods. In D. Yan, A.A. von Davier, & C. Lewis (Eds.), Computerized multistage testing: Theory and applications (pp. 55–67). CRC Press.
  • Kim, H., & Plake, B.S. (1993). Monte Carlo simulation comparison of two-stage testing and computerized adaptive testing [Paper presentation]. Annual Meeting of the National Council on Measurement in Education, Atlanta, GA. https://eric.ed.gov/?id=ED357041
  • Luo, X., & Kim, D. (2018). A top-down approach to designing the computerized adaptive multistage test. Journal of Educational Measurement, 55(2), 243-263. https://doi.org/10.1111/jedm.12174
  • Luecht, R.M., & Nungester, R.J. (1998). Some practical examples of computer-adaptive sequential testing. Journal of Educational Measurement, 35(3), 229-249. http://www.jstor.org/stable/1435202
  • Magis, D., Yan, D., & von Davier A.A. (2018). Package ‘mstR’: Procedures to generate patterns under multistage testing. https://cran.r-project.org/web/packages/mstR/mstR.pdf
  • OECD (2002). Proposed standard practice for surveys on research and experimental development: Frascati Manual 2002. OECD Publishing. https://doi.org/10.1787/9789264199040-en
  • Park, R. (2015). Investigating the impact of a mixed-format item pool on optimal test designs for multistage testing [Doctoral dissertation, The University of Texas]. University of Texas Libraries. http://hdl.handle.net/2152/31011
  • Park, R., Kim, J., Chung, H., & Dodd, B.G. (2014). Enhancing pool utilization in constructing the multistage test using mixed-format tests. Applied Psychological Measurement, 38(4), 268-280. https://doi.org/10.1177/0146621613515
  • Park, R., Kim, J., Chung, H., & Dodd, B.G. (2017). The development of MST test information for the prediction of test performances. Educational and Psychological Measurement, 77(4), 570-586. https://doi.org/10.1177/0013164416662960
  • Patsula, L.N. (1999). A comparison of computerized adaptive testing and multistage testing [Doctoral dissertation, University of Massachusetts]. University of Massachusetts Libraries. https://doi.org/10.7275/10994910
  • R Core Team. (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing. http://www.R-project.org/
  • Rosa, K., Swygert, K.A., Nelson, L., & Thissen, D. (2001). Item response theory applied to combinations of multiple-choice and constructed-response items—scale scores for patterns of summed scores. In D. Thissen & H. Wainer (Eds.), Test scoring (1st ed., pp. 253–292). Routledge. https://doi.org/10.4324/9781410604729
  • Sahin, M.G., & Ozturk, N.B. (2019). Analyzing the maximum likelihood score estimation method with fences in ca-MST. International Journal of Assessment Tools in Education, 6(4), 555-567. https://dx.doi.org/10.21449/ijate.634091
  • Sari, H.I., & Raborn, A. (2018). What information works best?: A comparison of routing methods. Applied Psychological Measurement, 42(6), 499-515. https://doi.org/10.1177/0146621617752990
  • Senel, S., & Kutlu, O. (2018). Computerized adaptive testing design for students with visual impairment. Education and Science, 43(194), 261-284. https://doi.org/10.15390/EB.2018.7515
  • Wang, K. (2017). A fair comparison of the performance of computerized adaptive testing and multistage adaptive testing [Doctoral dissertation, Michigan State University]. Michigan State University Libraries. https://doi.org/10.25335/ypy5-6g68
  • Weiss, D.J. (1982). Improving measurement quality and efficiency with adaptive testing. Applied Psychological Measurement, 6(4), 473-492. https://doi.org/10.1177/014662168200600408
  • Weiss, D.J. (1983). Latent trait theory and adaptive testing. In Weiss D.J. (Ed.), New horizons in testing (pp. 5-7). Academic Press.
  • Weiss, D.J., & Kingsbury, G.G. (1984). Application of computer adaptive testing to educational problems. Journal of Educational Measurement, 21(4), 361-375. https://doi.org/10.1111/j.1745-3984.1984.tb01040.x
  • Weissman, A., Belov, D.I., & Armstrong, R.D. (2007). Information-based versus number-correct routing in multistage classification tests. LSAC Computerized Testing Report, 7(5). Law School Admission Council.
  • Yahsi Sari, H., & Kelecioglu, H. (2023). Ability estimation with polytomous items in computerized multistage tests. Journal of Measurement and Evaluation in Education and Psychology, 14(3), 171-184. https://doi.org/10.21031/epod.1056079
  • Yamamoto, K., Shin, H., & Khorramdel, L. (2019). Introduction of multistage adaptive testing design in PISA 2018. OECD Education Working Papers, No. 209, OECD Publishing. https://doi.org/10.1787/b9435d4b-en
  • Zenisky, A.L. (2004). Evaluating the effects of several multi-stage testing design variables on selected psychometric outcomes for certification and licensure assessment [Doctoral dissertation, University of Massachusetts]. University of Massachusetts Libraries. https://doi.org/10.7275/18739572
  • Zenisky, A.L., Hambleton, R.K., & Luecht, R.M. (2010). Multistage testing: Issues, designs, and research. In W.J. van der Linden & C.A.W. Glas (Eds.), Elements of adaptive testing (pp. 355-372). Springer.
  • Zheng, Y., Nozawa, Y., Gao, X., & Chang, H.H. (2012). Multistage adaptive testing for a large-scale classification test: The designs, automated heuristic assembly, and comparison with other testing modes (ACT Research Report 2012-6). ACT. https://www.act.org/content/dam/act/unsecured/documents/ACT_RR2012-6.pdf
There are 37 citations in total.

Details

Primary Language English
Subjects Computer Based Exam Applications
Journal Section Articles
Authors

Hasibe Yahsi Sari 0000-0002-0451-6034

Hülya Kelecioğlu 0000-0002-0741-9934

Early Pub Date May 1, 2025
Publication Date June 1, 2025
Submission Date April 29, 2024
Acceptance Date March 6, 2025
Published in Issue Year 2025 Volume: 12 Issue: 2

Cite

APA Yahsi Sari, H., & Kelecioğlu, H. (2025). The effect of polytomous item ratio on ability estimation in multistage tests. International Journal of Assessment Tools in Education, 12(2), 414-429. https://doi.org/10.21449/ijate.1474855
