Research Article

Investigating the Performance of Item Selection Algorithms in terms of Measurement Accuracy in CD-CAT

Year 2022, Issue 54, 188-214, 02.01.2022
https://doi.org/10.9779/pauefd.769548

Abstract

The aim of this study is to examine the performance of item selection algorithms in terms of measurement accuracy across different numbers of attributes, item quality levels, and test lengths for the DINA and DINO models in cognitive diagnostic computerized adaptive testing (CD-CAT). Within the scope of the study, the number of attributes was manipulated as 5 and 8, and each item was constrained to measure at least one and at most four attributes. In data generation, the g (guessing) and s (slip) parameters were drawn from a uniform distribution: U(0.05, 0.25) for the high item-quality level and U(0.10, 0.30) for the low item-quality level. Cognitive patterns for 3,000 examinees were generated such that each examinee had a 50% chance of possessing each attribute. Fixed test lengths of 8, 16, and 24 items were used as the termination rule. GDI, JSD, MI, PWCDI, and PWKL were used as the item selection algorithms. Their performance was evaluated in terms of attribute and pattern recovery rates. Data generation and analysis were carried out using R 3.6.3. The results showed that the measurement accuracy of all algorithms increased as item quality and test length increased, and decreased as the number of attributes increased. The JSD algorithm achieved the highest measurement accuracy under all conditions, while the PWKL algorithm achieved the lowest. The algorithms other than PWKL performed approximately equally under the DINA and DINO models, whereas the measurement accuracy of PWKL was lower under the DINO model than under the DINA model.
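The data-generation design described above can be sketched in code. The following is a minimal illustration written in Python (the study itself used R 3.6.3); the attribute counts, g/s ranges, Q-matrix constraint, and sample size follow the abstract, while the item-pool size and all function and variable names are assumptions made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

n_examinees = 3000
n_attributes = 5           # the study also used 8
n_items = 300              # hypothetical item-pool size (not stated in the abstract)

# Each examinee possesses each attribute with probability .50.
alpha = rng.binomial(1, 0.5, size=(n_examinees, n_attributes))

# Q-matrix: each item measures at least 1 and at most 4 attributes.
Q = np.zeros((n_items, n_attributes), dtype=int)
for j in range(n_items):
    measured = rng.choice(n_attributes, size=rng.integers(1, 5), replace=False)
    Q[j, measured] = 1

# Guessing (g) and slip (s) parameters: U(0.05, 0.25) = high item quality;
# the low-quality condition would use rng.uniform(0.10, 0.30, n_items).
g = rng.uniform(0.05, 0.25, n_items)
s = rng.uniform(0.05, 0.25, n_items)

def eta_dina(alpha, Q):
    """DINA (conjunctive): all attributes an item measures must be possessed."""
    return (alpha @ Q.T == Q.sum(axis=1)).astype(int)

def eta_dino(alpha, Q):
    """DINO (disjunctive): possessing any one required attribute suffices."""
    return (alpha @ Q.T > 0).astype(int)

def simulate(eta, g, s, rng):
    """P(correct) = 1 - s when eta = 1, and g when eta = 0."""
    return rng.binomial(1, np.where(eta == 1, 1 - s, g))

X_dina = simulate(eta_dina(alpha, Q), g, s, rng)

# The two outcome measures used to evaluate the algorithms:
def attribute_recovery(alpha_hat, alpha):
    """Proportion of individual attributes classified correctly."""
    return (alpha_hat == alpha).mean()

def pattern_recovery(alpha_hat, alpha):
    """Proportion of examinees whose full attribute pattern is recovered exactly."""
    return (alpha_hat == alpha).all(axis=1).mean()
```

Note that the DINO condition is logically weaker than the DINA condition, so for any examinee-item pair the DINO latent response is always at least as large as the DINA one; the CD-CAT item selection step itself (GDI, JSD, MI, PWCDI, PWKL) is not sketched here.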

References

  • Cheng, Y. (2009). When cognitive diagnosis meets computerized adaptive testing: CD-CAT. Psychometrika, 74, 619–632.
  • de la Torre, J. (2009). A cognitive diagnosis model for cognitively-based multiple-choice options. Applied Psychological Measurement, 33, 163–183.
  • DiBello, L., Roussos, L. A., & Stout, W. F. (2007). Review of cognitively diagnostic assessment and a summary of psychometric models. In C. R. Rao & S. Sinharay (Eds.), Handbook of statistics (Vol. 26, pp. 979–1030). Elsevier.
  • Embretson, S. E. (2001). The second century of ability testing: Some predictions and speculations. Retrieved from http://www.ets.org/Media/Research/pdf/PICANG7.pdf
  • Haertel, E. H. (1984). An application of latent class models to assessment data. Applied Psychological Measurement, 8, 333–346.
  • Hsu, C. L., & Wang, W. C. (2015). Variable-length computerized adaptive testing using the higher order DINA model. Journal of Educational Measurement, 52, 125–143.
  • Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25(3), 258–272.
  • Kaplan, M., de la Torre, J., & Barrada, J. R. (2015). New item selection methods for cognitive diagnosis computerized adaptive testing. Applied Psychological Measurement, 39, 167–188.
  • Minchen, N. D., & de la Torre, J. (2016, July). The continuous G-DINA model and the Jensen-Shannon divergence. Paper presented at the International Meeting of the Psychometric Society, Asheville, NC.
  • Rupp, A. A., Henson, R. A., & Templin, J. L. (2010). Diagnostic measurement: Theory, methods, and applications. Guilford Press.
  • Stocking, M. L. (1994). Three practical issues for modern adaptive testing item pools (ETS Research Rep. No. 94-5). Princeton, NJ: Educational Testing Service.
  • Tatsuoka, C. (2002). Data analytic methods for latent partially ordered classification models. Journal of the Royal Statistical Society: Series C (Applied Statistics), 51, 337–350.
  • Tatsuoka, C., & Ferguson, T. (2003). Sequential classification on partially ordered sets. Journal of the Royal Statistical Society: Series B, 65, 143–157.
  • Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11, 287–305.
  • Thissen, D., & Mislevy, R. J. (2000). Testing algorithms. In H. Wainer (Ed.), Computerized adaptive testing: A primer (pp. 101–133). Mahwah, NJ: Lawrence Erlbaum Associates.
  • von Davier, M. (2005). A general diagnostic model applied to language testing data (Research Report 05-16). Princeton, NJ: Educational Testing Service.
  • Wang, C. (2013). Mutual information item selection method in cognitive diagnostic computerized adaptive testing with short test length. Educational and Psychological Measurement, 73(6), 1017–1035.
  • Xu, X., Chang, H., & Douglas, J. (2003). Computerized adaptive testing strategies for cognitive diagnosis. Paper presented at the annual meeting of the National Council on Measurement in Education, Montreal, Canada.
  • Yigit, H. D., Sorrel, M. A., & de la Torre, J. (2019). Computerized adaptive testing for cognitively based multiple-choice data. Applied Psychological Measurement, 43(5), 388–401.
  • Zheng, C., & Chang, H.-H. (2016). High-efficiency response distribution–based item selection algorithms for short-length cognitive diagnostic computerized adaptive testing. Applied Psychological Measurement, 40, 608–624.

BT-BOBUT Uygulamalarında Madde Seçim Algoritmalarının Performanslarının Ölçme Doğruluğu Açısından İncelenmesi



Details

Primary Language: Turkish
Section: Articles
Authors

Semih Aşiret 0000-0002-0577-2603

Seçil Ömür Sünbül 0000-0001-9442-1516

Publication Date: January 2, 2022
Submission Date: July 15, 2020
Acceptance Date: August 23, 2021
Published Issue: Year 2022, Issue 54

How to Cite

APA Aşiret, S., & Ömür Sünbül, S. (2022). BT-BOBUT Uygulamalarında Madde Seçim Algoritmalarının Performanslarının Ölçme Doğruluğu Açısından İncelenmesi. Pamukkale Üniversitesi Eğitim Fakültesi Dergisi(54), 188-214. https://doi.org/10.9779/pauefd.769548