Effects of Item Pool Characteristics on Ability Estimate and Item Pool Utilization: A Simulation Study

Nagihan Boztunç Öztürk; Melek Gülşah Şahin

Research Article

Madde Havuzu Özelliklerinin Yetenek Kestirimi ve Madde Havuzu Kullanımına Etkileri: Bir Simülasyon Çalışması

Year 2019, Volume: 34 Issue: 2, 473 - 486, 30.04.2019

Abstract

Developing an item pool for computerized adaptive testing is a long and
laborious process that can be taxing in terms of both cost and time. Questions
such as ‘What should an optimal item pool look like?’ and ‘What is the minimum
number of items an item pool should contain?’ are therefore frequently raised.
Studies on the characteristics that an optimal item pool should have vary, and
no consensus has been reached, particularly regarding item pool size. This study
examined how item pools with different numbers of items and different item
distributions affect ability estimation and item pool utilization. A total of 36
different item pools were generated using the SimulCAT software. Single-session
CAT environments were simulated with 1,000 simulees, and two different
termination rules were used. Overall, the results showed that as the item pool
grew up to a certain size, measurement precision increased and the number of
unused items decreased. With respect to the b parameter, the effect of the
b parameter distribution on the values decreased as the item pool grew.

Keywords

Computerized adaptive testing, item pool quality, item pool size, item pool utilization, ability estimation

Effects of Item Pool Characteristics on Ability Estimate and Item Pool Utilization: A Simulation Study

Abstract

Forming an item pool for computerized adaptive testing (CAT) is a long and
demanding process that can be challenging in terms of both time and cost.
Questions such as ‘What should an optimal item pool look like?’ and ‘How many
items should an item pool contain?’ therefore arise frequently. Although
research on the features of an optimal item pool varies, no consensus has been
reached on how large the item pool should be. The current study analysed the
effect of item pool size and item distribution on ability estimation and item
pool utilization. Thirty-six different item pools were generated with the
SimulCAT software. Single-session CAT environments were simulated with 1,000
simulees, and two different termination rules were used. The findings indicated
that as the item pool grew up to a certain size, measurement precision increased
and the number of unused items decreased. With respect to the b parameter, the
effect of the b parameter distribution on the results decreased as the item pool
grew.
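
The kind of simulation summarized in the abstract can be illustrated with a small sketch. It is not the authors' SimulCAT setup, and every modelling choice in it is an assumption made for illustration: a 2PL response model, maximum-information item selection, EAP scoring under a standard normal prior, hypothetical parameter distributions, and two example termination rules (a fixed test length and a standard-error threshold), since the abstract does not name the rules that were used. All function names are hypothetical.

```python
import math
import random

GRID = [-4 + 0.1 * k for k in range(81)]  # quadrature points for EAP scoring


def p_correct(theta, a, b):
    """2PL probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def make_pool(n_items, rng, b_sd=1.0):
    """Hypothetical pool: a ~ U(0.8, 2.0), b ~ N(0, b_sd)."""
    return [(rng.uniform(0.8, 2.0), rng.gauss(0.0, b_sd)) for _ in range(n_items)]


def eap(responses):
    """EAP estimate and posterior SD under a standard normal prior."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized prior density
        for (a, b), u in responses:
            p = p_correct(t, a, b)
            w *= p if u else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(pool, true_theta, rng, max_items=20, se_target=None):
    """Administer one CAT; returns (ability estimate, indices of items used)."""
    theta_hat = 0.0
    used, responses = [], []
    while len(used) < max_items:
        remaining = [i for i in range(len(pool)) if i not in used]
        if not remaining:
            break
        # Pick the unused item that is most informative at the current estimate.
        best = max(remaining, key=lambda i: item_information(theta_hat, *pool[i]))
        u = rng.random() < p_correct(true_theta, *pool[best])
        used.append(best)
        responses.append((pool[best], u))
        theta_hat, se = eap(responses)
        # Optional variable-length rule: stop once the posterior SD is small enough.
        if se_target is not None and se <= se_target:
            break
    return theta_hat, used


def run_study(pool_size, n_simulees=200, seed=1, **cat_kwargs):
    """Returns (RMSE of the ability estimates, number of never-used items)."""
    rng = random.Random(seed)
    pool = make_pool(pool_size, rng)
    exposed, sq_err = set(), 0.0
    for _ in range(n_simulees):
        theta = rng.gauss(0.0, 1.0)
        est, used = run_cat(pool, theta, rng, **cat_kwargs)
        exposed.update(used)
        sq_err += (est - theta) ** 2
    return math.sqrt(sq_err / n_simulees), pool_size - len(exposed)


if __name__ == "__main__":
    for size in (50, 100, 200):
        rmse, unused = run_study(size, n_simulees=100, max_items=20)
        print(size, round(rmse, 3), unused)
```

Calling `run_study` with `max_items` alone gives a fixed-length test, while adding `se_target` gives a variable-length one. Comparing the returned RMSE and unused-item count across pool sizes, or across values of `b_sd`, mirrors the two outcomes the study reports, though the numbers from this sketch are not comparable to the published results.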

Keywords

Computerized adaptive testing, item pool quality, item pool size, item pool utilization, ability estimation

There are 33 references in total.

Details

Primary Language	English
Section	Articles
Authors	Nagihan Boztunç Öztürk 0000-0002-2777-5311 Melek Gülşah Şahin 0000-0001-5139-9777
Publication Date	April 30, 2019
Published in Issue	Year 2019 Volume: 34 Issue: 2

Cite

APA	Boztunç Öztürk, N., & Şahin, M. G. (2019). Effects of Item Pool Characteristics on Ability Estimate and Item Pool Utilization: A Simulation Study. Hacettepe Üniversitesi Eğitim Fakültesi Dergisi, 34(2), 473-486.
