Research Article

Bilgisayarda Bireyselleştirilmiş Sınıflama Testinde Çok Kategorili Sınıflama İçin Sınıflama Koşullarının İncelenmesi

Year 2024, Volume 37, Issue 1, 63-85, 30.04.2024
https://doi.org/10.19171/uefad.1357800

Abstract

This study used the R programming language to investigate how test efficiency and measurement precision change with the classification criterion, the item selection method, the ability estimation method, and the number of classification categories (two, three, or four) when a Computerized Adaptive Classification Test (CACT) is used for multi-category classification. Data for 500 dichotomous, unidimensional items and 1000 examinees were generated by simulation. Thirty-six conditions were defined, and the results of 25 replications were averaged for every condition. The results showed that as the number of classification categories increased, the Average Test Length (ATL) increased and the Average Classification Accuracy (ACA) decreased. The Root Mean Square Error (RMSE), Mean Absolute Error (MAE), bias, and the correlation between true and estimated abilities (r) were found to decrease. The Confidence Interval (CI) classification criterion performed more effectively for ATL, and the Sequential Probability Ratio Test (SPRT) classification criterion performed more effectively for ACA, bias, correlation, and MAE. The Generalized Likelihood Ratio (GLR) classification criterion produced results similar to the CI criterion in terms of ATL and similar to the SPRT criterion in terms of absolute error. The ability estimation methods performed similarly in terms of ACA and ATL. Cut-score-based (CB) item selection methods performed more effectively than estimated-ability-based (EB) item selection methods in terms of test efficiency and measurement precision.
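The measurement-precision indices named in the abstract (RMSE, mean absolute error, bias, and the correlation between true and estimated abilities) can be illustrated with a minimal sketch. The study itself was run in R; the Python below is only an illustration using the standard textbook definitions. The function name `precision_indices` and the sample values are hypothetical and are not taken from the article.

```python
import math

def precision_indices(true_theta, est_theta):
    """Return RMSE, MAE, bias, and Pearson r for true vs. estimated abilities."""
    n = len(true_theta)
    # Estimation error for each simulated examinee: estimate minus true value.
    errors = [e - t for t, e in zip(true_theta, est_theta)]
    rmse = math.sqrt(sum(d * d for d in errors) / n)
    mae = sum(abs(d) for d in errors) / n
    bias = sum(errors) / n
    # Pearson correlation between true and estimated abilities.
    mean_t = sum(true_theta) / n
    mean_e = sum(est_theta) / n
    cov = sum((t - mean_t) * (e - mean_e) for t, e in zip(true_theta, est_theta))
    var_t = sum((t - mean_t) ** 2 for t in true_theta)
    var_e = sum((e - mean_e) ** 2 for e in est_theta)
    r = cov / math.sqrt(var_t * var_e)
    return {"rmse": rmse, "mae": mae, "bias": bias, "r": r}

# Hypothetical values for illustration only (not data from the study).
true_theta = [-1.0, 0.0, 1.0, 2.0]
est_theta = [-0.5, 0.0, 0.5, 2.0]
result = precision_indices(true_theta, est_theta)
print(result)
```

In a simulation design like the one described, such indices would typically be computed once per replication and then averaged over the replications of each condition.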

Investigation of Classification Conditions For Multicategorical Classification in Computerized Adaptive Classification Test

Abstract

This study used the Computerized Adaptive Classification Test (CACT) for multi-category classification with the R programming language to investigate how test effectiveness and measurement accuracy changed with classification criteria, item selection methods, ability estimation methods, and the number of classification categories (two, three, and four). Data for 500 dichotomous, unidimensional items and 1000 examinees were simulated, 36 conditions were determined, and 25 replications were averaged for all conditions. Results showed that as the number of classification categories increased, the Average Test Length (ATL) increased and the Average Classification Accuracy (ACA) decreased. The Root Mean Square Error (RMSE), Mean Absolute Error (MAE), bias, and correlation (r) between true and estimated thetas were found to decrease. The Confidence Interval (CI) classification criterion was found to be more effective for ATL, ACA, bias, and correlation, and the Sequential Probability Ratio Test (SPRT) classification criterion for MAE. The Generalized Likelihood Ratio (GLR) classification criterion produced results similar to the CI criterion in terms of ATL, and to the SPRT classification criterion in terms of absolute error. Ability estimation methods were similar in terms of ACA and ATL. Cutscore-Based (CB) item selection methods were more effective in terms of test effectiveness and measurement accuracy than Estimated Ability-Based (EB) item selection methods.
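To make two of the classification criteria compared in the abstract more concrete, the following is a minimal sketch of an SPRT decision and a confidence-interval decision at a single cut score. It is written in Python for illustration, although the study used R, and it is not the authors' implementation. Several things here are assumptions: the two-parameter logistic (2PL) response model, the indifference-region half-width `delta`, the nominal error rates `alpha` and `beta`, the critical value `z`, and all function names. With more than one cut score, a rule of this kind would have to be applied at each cut score; the sketch covers one only.

```python
import math

def p_correct(theta, a, b):
    """2PL probability of a correct response (model assumed for this sketch)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses):
    """Log-likelihood of scored responses (1 = correct, 0 = incorrect) at theta."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = p_correct(theta, a, b)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def sprt_decision(items, responses, cut, delta=0.2, alpha=0.05, beta=0.05):
    """Wald's SPRT comparing theta = cut + delta against theta = cut - delta."""
    log_lr = (log_likelihood(cut + delta, items, responses)
              - log_likelihood(cut - delta, items, responses))
    upper = math.log((1.0 - beta) / alpha)   # classify above the cut score
    lower = math.log(beta / (1.0 - alpha))   # classify below the cut score
    if log_lr >= upper:
        return "above"
    if log_lr <= lower:
        return "below"
    return "continue"  # administer another item

def ci_decision(theta_hat, items, cut, z=1.96):
    """Classify once the confidence interval around theta_hat excludes the cut."""
    info = 0.0
    for a, b in items:
        p = p_correct(theta_hat, a, b)
        info += a * a * p * (1.0 - p)        # 2PL Fisher information
    se = 1.0 / math.sqrt(info)
    if theta_hat - z * se > cut:
        return "above"
    if theta_hat + z * se < cut:
        return "below"
    return "continue"

# Hypothetical item parameters (a, b) and responses, for illustration only.
items = [(1.5, 0.0)] * 20
print(sprt_decision(items, [1] * 20, cut=0.0))
print(ci_decision(1.0, items, cut=0.0))
```

Both rules return "continue" while the evidence is inconclusive, which is why the average test length depends on the criterion chosen.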

Details

Primary Language: Turkish
Subjects: Computer-Based Exam Applications
Section: Articles
Authors

Demet Alkan 0000-0002-1478-9183

Nuri Doğan 0000-0001-6274-2016

Publication Date: April 30, 2024
Submission Date: September 9, 2023
Published in Issue: Year 2024

Cite

APA Alkan, D., & Doğan, N. (2024). Bilgisayarda Bireyselleştirilmiş Sınıflama Testinde Çok Kategorili Sınıflama İçin Sınıflama Koşullarının İncelenmesi. Journal of Uludag University Faculty of Education, 37(1), 63-85. https://doi.org/10.19171/uefad.1357800