Analyzing Different Module Characteristics in Computer Adaptive Multistage Testing

Melek Gülşah Şahin

doi:10.21449/ijate.676947

Research Article

Year 2020, Volume: 7 Issue: 2, 191 - 206, 13.06.2020

Cited By: 4

Abstract

Computer adaptive multistage testing (ca-MST), which takes advantage of computer technology and the adaptive test format, is widely used and has become a popular topic in assessment and evaluation. This study analyzes the effects of panel design, module length, the sequence of a parameter values across stages, and the range of b parameter values on measurement precision in ca-MST implementations. The study was carried out as a simulation using the MSTGen software. A total of 5000 simulees were drawn from a standard normal distribution, N(0, 1). Sixty conditions were examined: two panel designs (1-3-3; 1-2-2), three module lengths (10, 15, 20), five a parameter sequences ("0.8; 0.8; 0.8", "1.4; 0.8; 0.8", "0.8; 1.4; 0.8", "0.8; 0.8; 1.4", "1.4; 1.4; 1.4"), and two b parameter difference conditions (small; large). Correlation, RMSE, and AAD values were calculated for each condition, and conditional RMSE values at each ability level are presented graphically. Unlike other studies in the literature, this study examines the b parameter difference condition in three-stage tests and its interaction with the a parameter sequence. The results show that measurement precision increases as the number and length of the modules increase. Measurement error decreases as item discrimination values increase in all stages. Placing highly discriminating items in the second or last stage contributes to measurement precision. At extreme ability levels, the large difficulty difference condition produces lower error values than the small difficulty difference condition.
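The design described in the abstract can be illustrated with a minimal sketch. It enumerates the fully crossed conditions (2 x 3 x 5 x 2 = 60) and computes the three precision indices. The abstract does not define the indices, so the usual definitions are assumed here: RMSE as the root mean squared difference between estimated and true ability, AAD as the average absolute difference, and Pearson correlation. The function names and the noisy placeholder estimates are hypothetical; this is not the MSTGen procedure and it does not reproduce the study's results.

```python
import itertools
import math
import random

# Design factors as listed in the abstract.
PANEL_DESIGNS = ["1-3-3", "1-2-2"]
MODULE_LENGTHS = [10, 15, 20]
A_SEQUENCES = [
    (0.8, 0.8, 0.8),
    (1.4, 0.8, 0.8),
    (0.8, 1.4, 0.8),
    (0.8, 0.8, 1.4),
    (1.4, 1.4, 1.4),
]
B_DIFFERENCES = ["small", "large"]


def conditions():
    """Return every combination of the four design factors."""
    return list(itertools.product(
        PANEL_DESIGNS, MODULE_LENGTHS, A_SEQUENCES, B_DIFFERENCES))


def rmse(true, est):
    """Root mean squared error (assumed standard definition)."""
    return math.sqrt(sum((e - t) ** 2 for t, e in zip(true, est)) / len(true))


def aad(true, est):
    """Average absolute difference (assumed standard definition)."""
    return sum(abs(e - t) for t, e in zip(true, est)) / len(true)


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    rng = random.Random(0)
    # 5000 simulees drawn from N(0, 1), as in the abstract.
    theta = [rng.gauss(0.0, 1.0) for _ in range(5000)]
    # Placeholder estimates: true ability plus noise (hypothetical stand-in
    # for the ability estimates a ca-MST administration would produce).
    theta_hat = [t + rng.gauss(0.0, 0.3) for t in theta]

    print("number of conditions:", len(conditions()))
    print("correlation:", round(pearson(theta, theta_hat), 3))
    print("RMSE:", round(rmse(theta, theta_hat), 3))
    print("AAD:", round(aad(theta, theta_hat), 3))
```

In a full simulation, the three indices would be computed once per condition, and RMSE would additionally be computed within ability intervals to obtain the conditional values.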

Keywords

ca-MST, Panel design, Module length, Item discrimination sequence, Difficulty difference condition

References

Boztunç Öztürk, N. (2019). How the length and characteristics of routing module affect ability estimation in ca-MST? Universal Journal of Educational Research, 7(1), 164-170. https://doi.org/10.13189/ujer.2019.070121
Chang, H., & Ying, Z. (1999). a-Stratified multistage computerized adaptive testing. Applied Psychological Measurement, 23(3), 211-222. https://doi.org/10.1177/01466219922031338
Chen, L. Y. (2010). An investigation of the optimal test design for multi-stage test using the generalized partial credit model (Unpublished doctoral dissertation). The University of Texas at Austin. Retrieved from https://repositories.lib.utexas.edu/handle/2152/ETD-UT-2010-12-344
Hadadi, A., & Luecht, R. M. (1998). Some methods for detecting and understanding test speededness on timed multiple-choice tests. Academic Medicine, 73, 47-50. https://journals.lww.com/academicmedicine/Citation/1998/10000/TESTING_THE_TEST__Some_Methods_for_Detecting_and.42.aspx
Hambleton, R.K., & Xing, D. (2006). Optimal and nonoptimal computer-based test designs for making pass-fail decisions. Applied Measurement in Education, 19(3), 221-229. https://doi.org/10.1207/s15324818ame1903_4
Han, K. T. (2013). MSTGen: Simulated data generator for multistage testing. Applied Psychological Measurement, 37(8), 666-668. https://doi.org/10.1177/0146621613499639
Han, K. T., & Guo, F. (2013). An approach to assembling optimal multistage testing modules on the fly (Report No. RR-13-01). Virginia: GMAC.
Hendrickson, A. (2007). An NCME instructional module on multistage testing. Educational Measurement: Issues and Practice, 26, 44-52. https://doi.org/10.1111/j.1745-3992.2007.00093.x
Jodoin, M. G., Zenisky, A., & Hambleton, R. K. (2006). Comparison of the psychometric properties of several computer-based test designs for credentialing exams with multiple purposes. Applied Measurement in Education, 19(3), 203-220. https://doi.org/10.1207/s15324818ame1903_3
Kim, H., & Plake, B. S. (1993, April). Monte Carlo simulation of two-stage testing and computerized adaptive testing. Paper presented at the Annual Meeting of the National Council on Measurement in Education, Atlanta.
Kim, J., Chung, H., Dodd, B.G., & Park, R. (2012). Panel design variations in the multistage test using the mixed-format tests. Educational and Psychological Measurement, 72(4), 574-588. https://doi.org/10.1177/0013164411428977
Kim, S., Moses, T., & Yoo, H. (2015). A comparison of IRT proficiency estimation methods under adaptive multistage testing. Journal of Educational Measurement, 52(1), 70-79. https://onlinelibrary.wiley.com/doi/abs/10.1111/jedm.12063
Kim, S., & Moses, T. (2014). An investigation of the impact of misrouting under two-stage multistage testing: A simulation study (Report No. RR-14-01). Princeton, NJ: Educational Testing Service.
Luecht, R. M. (2000, April). Implementing the computer-adaptive sequential testing (CAST) framework to mass produce high quality computer-adaptive and mastery tests. Paper presented at the Annual Meeting of the National Council on Measurement in Education, New Orleans, LA.
Luecht, R., Brumfield, T., & Breithaupt, K. (2006). A testlet assembly design for adaptive multistage tests. Applied Measurement in Education, 19(3), 189-202. https://doi.org/10.1207/s15324818ame1903_2
Luecht, R. M., & Nungester, R. J. (1998). Some practical examples of computer-adaptive sequential testing. Journal of Educational Measurement, 35(3), 229-249. https://doi.org/10.1111/j.1745-3984.1998.tb00537.x
Luecht, R. M., & Sireci, S. G. (2011). A review of models for computer-based testing (Report No. RR-2011-12). New York: College Board.
Lord, F. M. (1971). A theoretical study of two-stage testing. Psychometrika, 36(3), 227-242. https://doi.org/10.1007/BF02297844
Loyd, B. (1984, February). Efficiency and precision in two-stage adaptive testing. Paper presented at the Annual Meeting of Eastern Educational Research Association, West Palm Beach, FL.
Park, R. (2015). Investigating the impact of a mixed-format item pool on optimal test designs for multistage testing (Doctoral dissertation). The University of Texas at Austin. Retrieved from https://repositories.lib.utexas.edu/handle/2152/31011
Patsula, L.N. (1999). A comparison of computerized adaptive testing and multistage testing. (Doctoral dissertation). The University of Massachusetts Amherst. Retrieved from https://scholarworks.umass.edu/cgi/viewcontent.cgi?article=4283&context=dissertations_1
Rotou, O., Patsula, L., Steffen, M., & Rizavi, S. (2007, March). Comparison of multistage tests with computerized adaptive and paper and pencil tests (Report No. RR-07-04). Princeton, NJ: Educational Testing Service.
Sarı, H. İ., Yahşi Sarı, H., & Huggins Manley, A. C. (2016). Computer adaptive multistage testing: Practical issues, challenges and principles. Journal of Measurement and Evaluation in Education and Psychology, 7(2), 388-406. https://doi.org/10.21031/epod.280183
Schnipke, D.L., & Reese, L.M. (1999). A comparison of testlet-based test designs for computerized adaptive testing. (Report No: 97-01). Princeton, NJ: Law School Admission Council.
Sadeghi, K., & Khonbi, Z. A. (2017). An overview of differential item functioning in multistage computer adaptive testing using three-parameter logistic item response theory. Language Testing in Asia, 7(1), 1-16. https://doi.org/10.1186/s40468-017-0038-z
van der Linden, W. J. (2005). Linear models for optimal test design. New York: Springer.
Verschoor, A., & Eggen, T. (2014). Optimizing the test assembly and routing for multistage testing. In D. Yan, A. A. von Davier, & C. Lewis (Eds.), Computerized Multistage Testing: Theory and Applications (pp. 135-150). Taylor & Francis Group.
Xing, D., & Hambleton, R. K. (2004). Impact of test design, item quality, and item bank size on the psychometric properties of computer-based credentialing examinations. Educational and Psychological Measurement, 64(1), 5-21. https://doi.org/10.1177/0013164403258393
Zeng, W. (2016). Making test batteries adaptive by using multistage testing techniques (Doctoral dissertation). The University of Wisconsin-Milwaukee. Retrieved from https://dc.uwm.edu/cgi/viewcontent.cgi?article=2241&context=etd
Zenisky, A. L., & Hambleton, R. K. (2014). Multistage test design: Moving research results into practice. In D. Yan, A. A. von Davier, & C. Lewis (Eds.), Computerized Multistage Testing: Theory and Applications (pp. 21-37). Taylor & Francis Group.
Zenisky, A. L., & Jodoin, M. G. (1999). Current and future research in multistage testing (Report No. 370). Amherst, MA: University of Massachusetts School of Education.
Zenisky, A. L. (2004). Evaluating the effects of several multi-stage testing design variables on selected psychometric outcomes for certification and licensure assessment. (Doctoral dissertation). University of Massachusetts Amherst. Retrieved from https://scholarworks.umass.edu/dissertations/AAI3136800
Zheng, Y., Nozawa, Y., Gao, X., & Chang, H. (2012). Multistage adaptive testing for a large-scale classification test: Design, heuristic assembly, and comparison with other testing modes (Report No. 2012-6). Iowa City, IA: ACT.
Zheng, Y., & Chang, H. (2015). On-the-fly assembled multistage adaptive testing. Applied Psychological Measurement, 39(2), 104-118. https://doi.org/10.1177/0146621614544519

There are 34 citations in total.

Details

Primary Language	English
Subjects	Studies on Education
Journal Section	Articles
Authors	Melek Gülşah Şahin 0000-0001-5139-9777
Publication Date	June 13, 2020
Submission Date	January 19, 2020
Published in Issue	Year 2020 Volume: 7 Issue: 2

Cite

APA	Şahin, M. G. (2020). Analyzing Different Module Characteristics in Computer Adaptive Multistage Testing. International Journal of Assessment Tools in Education, 7(2), 191-206. https://doi.org/10.21449/ijate.676947

Cited By

Computerized Multistage Testing: Principles, Designs and Practices with R

Measurement: Interdisciplinary Research and Perspectives

https://doi.org/10.1080/15366367.2022.2158017

The Effect of Test Design on Misrouting in Computerized Multistage Testing

Uluslararası Türk Eğitim Bilimleri Dergisi

https://doi.org/10.46778/goputeb.1267319

A shortened test is feasible: Evaluating a large-scale multistage adaptive English language assessment

Language Testing

https://doi.org/10.1177/02655322231225426

Çok Aşamalı Testlerin Panel Deseni, Modül Uzunluğu, Örneklem Büyüklüğü ve Yetenek Parametresi Kestirim Yöntemleri Açısından Farklı Koşullar Altında Karşılaştırılması

Boğaziçi Üniversitesi Eğitim Dergisi

https://doi.org/10.52597/buje.1329338
