Research Article

Analyzing Different Module Characteristics in Computer Adaptive Multistage Testing

Year 2020, Volume: 7, Issue: 2, 191-206, 13.06.2020
https://doi.org/10.21449/ijate.676947

Abstract

Computer adaptive multistage testing (ca-MST), which takes advantage of computer technology and an adaptive test form, is widely used and has become a popular topic in assessment and evaluation. This study analyzes the effect of different panel designs, module lengths, sequences of the a-parameter value across stages, and changes in the b-parameter range on measurement precision in ca-MST implementations. The study was carried out as a simulation using the MSTGen software tool, with 5,000 simulees drawn from a standard normal distribution, N(0, 1). Sixty conditions were analyzed, crossing two panel designs (1-3-3; 1-2-2), three module lengths (10, 15, 20 items), five a-parameter sequences ("0.8; 0.8; 0.8", "1.4; 0.8; 0.8", "0.8; 1.4; 0.8", "0.8; 0.8; 1.4", "1.4; 1.4; 1.4"), and two b-parameter difference conditions (small; large). Correlation, RMSE, and AAD values were calculated for each condition, and conditional RMSE values corresponding to each ability level are presented graphically. Unlike other studies in the literature, this study examines the b-parameter difference condition in three-stage tests and its interaction with the a-parameter sequence. The results show that measurement precision increases as the number and length of modules increase, and measurement error decreases as item discrimination increases in all stages. Placing highly discriminating items in the second or last stage contributes to measurement precision. At extreme ability levels, the large difficulty-difference condition produces lower error values than the small difficulty-difference condition.
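For readers who want to see how the reported outcome measures are obtained, the sketch below shows one way the precision indices (correlation, RMSE, and AAD, taken here as the average absolute difference between true and estimated abilities) could be computed for a single condition. It is a minimal illustration under assumed data: the simulated ability estimates, the error standard deviation of 0.3, and the random seed are hypothetical placeholders, not output from MSTGen or from the study's actual conditions.

```python
import numpy as np

def precision_indices(theta_true, theta_est):
    """Return correlation, RMSE, and AAD between true and estimated abilities.

    RMSE = sqrt(mean((theta_est - theta_true)^2))
    AAD  = mean(|theta_est - theta_true|)
    """
    theta_true = np.asarray(theta_true, dtype=float)
    theta_est = np.asarray(theta_est, dtype=float)
    diff = theta_est - theta_true
    corr = np.corrcoef(theta_true, theta_est)[0, 1]
    rmse = np.sqrt(np.mean(diff ** 2))
    aad = np.mean(np.abs(diff))
    return corr, rmse, aad

# Hypothetical example: 5,000 simulees drawn from N(0, 1), as in the study,
# with estimation error added here purely for illustration (not MSTGen output).
rng = np.random.default_rng(seed=1)
theta_true = rng.normal(0.0, 1.0, size=5000)
theta_est = theta_true + rng.normal(0.0, 0.3, size=5000)

corr, rmse, aad = precision_indices(theta_true, theta_est)
print(f"r = {corr:.3f}, RMSE = {rmse:.3f}, AAD = {aad:.3f}")
```

Lower RMSE and AAD values and higher correlations indicate better measurement precision; in the study these indices are compared across the 60 design conditions.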

Details

Primary Language: English
Subjects: Studies on Education
Section: Articles
Authors

Melek Gülşah Şahin (ORCID: 0000-0001-5139-9777)

Publication Date: June 13, 2020
Submission Date: January 19, 2020
Published in Issue: Year 2020, Volume: 7, Issue: 2

How to Cite

APA Şahin, M. G. (2020). Analyzing Different Module Characteristics in Computer Adaptive Multistage Testing. International Journal of Assessment Tools in Education, 7(2), 191-206. https://doi.org/10.21449/ijate.676947
