Investigation of Item Selection Methods According to Test Termination Rules in CAT Applications

Sema Sulak; Hülya Kelecioğlu

doi:10.21031/epod.530528

Research Article

Year 2019, Volume: 10 Issue: 3, 315 - 326, 04.09.2019


Abstract

In this study, item selection methods in computerized adaptive testing (CAT) were investigated with respect to ability estimation methods and test termination rules. For this purpose, an item pool of 250 items and 2,000 examinees were simulated (M = 0, SD = 1). A total of thirty CAT conditions were created by crossing five item selection methods (Maximum Fisher Information, a-stratification, Likelihood Weight Information Criterion, Gradual Information Ratio, and Kullback-Leibler), two ability estimation methods (Maximum Likelihood Estimation and Expected a Posteriori), and three test termination rules (40 items, SE < .20, and SE < .40). Under the fixed test-length stopping rule, the SE values obtained with Maximum Likelihood Estimation were higher than those obtained with Expected a Posteriori estimation. With Maximum Likelihood Estimation, the a-stratification item selection method yielded the highest SE value when the test length was smaller than 30, whereas the Kullback-Leibler method yielded the highest SE value when the test length was larger than 30. With Expected a Posteriori estimation, a-stratification yielded the highest SE value at all test lengths. When the termination rule was SE < .20 and Maximum Likelihood Estimation was used, the lowest and highest average numbers of items were obtained with the Gradual Information Ratio and Maximum Fisher Information item selection methods, respectively. When the termination rule was SE < .20 and Expected a Posteriori estimation was used, the lowest average number of items was obtained with Kullback-Leibler and the highest with the Likelihood Weight Information Criterion. When the termination rule was SE < .40 and Maximum Likelihood Estimation was used, the maximum and minimum numbers of items were obtained with the Maximum Fisher Information and Kullback-Leibler item selection methods, respectively. With Expected a Posteriori estimation under the same rule, the maximum and minimum numbers of items were obtained with Maximum Fisher Information and a-stratification, respectively. Under both the SE < .20 and SE < .40 stopping rules, the average number of items was highest with Maximum Likelihood Estimation for all item selection methods.
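The thirty conditions follow from fully crossing the three design factors named in the abstract (5 x 2 x 3). The short Python sketch below only enumerates that design and expresses the three termination rules as a predicate. The variable names, the tuple encoding of the rules, and the `should_stop` helper are illustrative assumptions; they are not taken from the article or from the simulation software the authors used.

```python
from itertools import product

# Design factors as listed in the abstract.
ITEM_SELECTION = [
    "Maximum Fisher Information",
    "a-stratification",
    "Likelihood Weight Information Criterion",
    "Gradual Information Ratio",
    "Kullback-Leibler",
]
ABILITY_ESTIMATION = ["Maximum Likelihood Estimation", "Expected a Posteriori"]
# Hypothetical encoding: (rule kind, threshold).
TERMINATION = [("fixed_length", 40), ("se_below", 0.20), ("se_below", 0.40)]


def should_stop(rule, n_items, se):
    """Return True when the given termination rule is satisfied.

    rule    -- one of the tuples in TERMINATION (assumed encoding)
    n_items -- number of items administered so far
    se      -- current standard error of the ability estimate
    """
    kind, value = rule
    if kind == "fixed_length":
        return n_items >= value
    return se < value


# Full crossing of the three factors.
conditions = list(product(ITEM_SELECTION, ABILITY_ESTIMATION, TERMINATION))
print(len(conditions))  # 5 * 2 * 3 = 30
```

In a variable-length condition the test ends as soon as the standard error drops below the threshold, which is why the abstract reports average numbers of items for the SE-based rules and SE values for the fixed-length rule.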

Keywords

Computerized adaptive testing, maximum Fisher information, a-stratification, likelihood weight information criterion, gradual information ratio, Kullback-Leibler

References

Bock, R. D., & Mislevy, R. J. (1982). Adaptive EAP Estimation of Ability in a Microcomputer Environment. Applied Psychological Measurement, 6(4), 431–444.
Chang, H.-H., Qian, J., Ying, Z. (2001). A-Stratified Multistage Adaptive Testing With b Blocking. Applied Psychological Measurement, 25(4), pp. 333-341.
Chang, H.-H., Ying, Z. (1996). A Global Information Approach to Computerized Adaptive Testing. Applied Psychological Measurement, 20, pp. 213-229.
Chang, H.-H., Ying, Z. (1999). a-Stratified Multistage Testing. Applied Psychological Measurement, 23(3), pp. 211-222.
Costa, D., Karino, C., Moura, F., Andrade, D. (2009). A Comparison of Three Methods of Item Selection for Computerized Adaptive Testing. 2009 GMAC Conference on Computerized Adaptive Testing, June.
Deng, H., Ansley, T., Chang, H. (2010). Stratified and Maximum Information Item Selection Procedures in Computer Adaptive Testing. Journal of Educational Measurement, Vol.47, No.2, pp 202-226.
Deng, H. & Chang, H.H. (2001). A-Stratified Computerized Adaptive Testing with Unequal Item Exposure across Strata. Presented at American Educational Research Association Annual Meeting 2001.Retrieved February 21, 2012 from https://www.learntechlib.org/p/93050/.
Eggen, T. H. J. M. (1999). Item Selection in Adaptive Testing with the Sequential Probability Ratio Test. Applied Psychological Measurement, Vol.23, No.3, pp 249-261.
Han, K. (2009). Gradual Maximum Information Ratio Approach to Item Selection in Computerized Adaptive Testing. Graduate Management Admission Council Research Reports, RR-09-07, June 25, USA.
Han, K. (2010). Comparison of Non-Fisher Information Item Selection Criteria in Fixed Length Computerized Adaptive Testing. Paper presented at the Annual Meeting of the National Council on Measurement in Education, Denver.
Han, K. (2012). SimulCAT: Windows Application That Simulates Computerized Adaptive Test Administration. Applied Psychological Measurement, 36.
Işeri, A. I. (2002). Assessment of Students' Mathematics Achievement Through Computer Adaptive Testing Procedures. Unpublished doctoral dissertation. Middle East Technical University, Turkey.
Kalender, İ. (2011). Effects of Different Computerized Adaptive Testing Strategies on Recovery of Ability. Unpublished Doctoral Dissertation. Middle East Technical University, Ankara.
Kaptan, F. (1993). Yetenek Kestiriminde Adaptive (bireyselleştirilmiş) Test Uygulaması ile Geleneksel Kağıt-kalem Testi Uygulamasının Karşılaştırılması [Comparison of adaptive (individualized) test administration and traditional paper-and-pencil test administration in ability estimation]. Unpublished doctoral dissertation, Hacettepe University.
Linda, T. (1996). A Comparison of the Traditional Maximum Information Method and the Global Information Method in CAT Item Selection. Annual Meeting of the National Council on Measurement in Education, New York, April.
Orcutt, V. L. (2002). Computerized Adaptive Testing: Some Issues in Development. Annual Meeting of the Educational Research Exchange, University of North Texas, February, Denton, Texas.
Slater, S. C. (2001). Pretest Item Calibration Within The Computerized Adaptive Testing Environment. Unpublished Doctoral Dissertation, Graduate School of the University Massachusetts, Amherst.
Sireci, S. (2003). Computerized Adaptive Testing: An Introduction. In Wall & Walz (Eds.), Measuring Up: Assessment Issues for Teachers, Counselors and Administrators, CAPS Press, pp.12.
Thissen, D. & Mislevy, R. J. (2000). Testing algorithms. In H. Wainer (Ed.), Computerized Adaptive Testing: A primer, Mahwah, NJ: Lawrence Erlbaum Associates, Inc, pp. 101-133.
Van Der Linden, W.J., Glas, C.A.W. (2010). Elements of Adaptive Testing, Statistics for Social and Behavioral Sciences, Springer New York Dordrecht Heidelberg London, ISBN: 978-0-387-85459-5.
Veerkamp, W.J.J., Berger, M.P.F. (1997). Some New Item Selection Criteria for Adaptive Testing. Journal of Educational and Behavioral Statistics, Vol.22, No.2, pp 203-226.
Veldkamp, B.P. (2012). Ensuring The Future of Computerized Adaptive Testing. In Theo, J.H.M; Veldkamp, B.P. (ed). Psychometrics in Practice at RCEC. University of Twente, Netherlands, 978-90-365-3374-4.
Wainer, H., Dorans, N., Flaugher, R., Green, B., Mislevy, R., Steinberg, L., Thissen, D. (2000). Computerized adaptive testing: A primer. Hillsdale, NJ: Lawrence Erlbaum Associates.
Wang, T., Vispoel, W. (1998). Properties of Ability Estimation Methods in Computerized Adaptive Testing. Journal of Educational Measurement, Vol.35, No.2, pp 109-135.
Weiss, D. J. (1983). Latent Trait Theory and Adaptive Testing. In David J. Weiss (ed.). New Horizons in Testing: Latent Trait Test Theory and Computerized Adaptive Testing. (pp. 5-7). New York: Academic Press.
Weiss, D. J., Kingsbury, G. G. (1984). Application of Computerized Adaptive Testing to Educational Problems. Journal of Educational Measurement, 21, 361-375.
Weissman, A. (2003). Assessing the Efficiency of Item Selection in Computerized Adaptive Testing. Paper presented at the Annual Meeting of the American Educational Research Association, April, Chicago.
Wen, H., Chang, H., Hau, K. (2001). Adaptation of a-stratified Method in Variable Length Computerized Adaptive Testing. American Educational Research Association Annual Meeting, Seattle.
Yi, Q., Chang, H. (2003). a-Stratified CAT Design With Content Blocking. British Journal of Mathematical and Statistical Psychology, vol. 56, pp 359–378.

There are 28 citations in total.

Details

Primary Language	English
Journal Section	Articles
Authors	Sema Sulak 0000-0002-2849-321X Hülya Kelecioğlu 0000-0002-0741-9934
Publication Date	September 4, 2019
Acceptance Date	July 6, 2019
Published in Issue	Year 2019 Volume: 10 Issue: 3

Cite

APA	Sulak, S., & Kelecioğlu, H. (2019). Investigation of Item Selection Methods According to Test Termination Rules in CAT Applications. Journal of Measurement and Evaluation in Education and Psychology, 10(3), 315-326. https://doi.org/10.21031/epod.530528

Cited By

Investigation of Measurement Precision and Test Lengths in Computerized Adaptive Tests in Different Conditions

Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi

https://doi.org/10.21031/epod.1068572

The Effects of Different Item Selection Methods on Test Information and Test Efficiency in Computer Adaptive Testing

Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi

https://doi.org/10.21031/epod.1140757

Applicability And Efficiency of a Polytomous IRT-Based Computerized Adaptive Test for Measuring Psychological Traits

Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi

https://doi.org/10.21031/epod.1148313
