Comparison of Different Ability Estimation Methods Based on 3 and 4PL Item Response Theory

Ebru Doğruöz; Çiğdem Akın Arıkan

doi:10.9779/pauefd.585774

Abstract

This study compared different ability estimation methods under dichotomous Item Response Theory (IRT) models. The analysis used responses to the 20 items of the Mathematics subtest of the TEOG (National Transition from Primary to Secondary Education) exam, taken by 8th-grade students in 2015-2016. The study group consisted of 400 students randomly selected from those who participated in the TEOG exam. Ability estimates and their standard errors were calculated from these data, and the estimates were compared using two-way repeated-measures analysis of variance (ANOVA). The findings showed that the four-parameter logistic (4PL) model fit the data better than the three-parameter logistic (3PL) model. Among the ability estimation methods, Weighted Likelihood Estimation (WLE) was more accurate than Maximum A Posteriori (MAP) and Expected A Posteriori (EAP) estimation. WLE and MAP gave the lower standard error values under the 4PL and 3PL models, respectively. The highest marginal reliability coefficient was obtained with MAP estimates for the 3PL model and with WLE estimates for the 4PL model. It was concluded that ability scores obtained with the WLE method under the 4PL model had higher accuracy.
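As a rough illustration of the quantities the abstract compares, the following sketch computes the 4PL response probability and grid-based EAP and MAP ability estimates under a standard normal prior. It is not the authors' procedure: the item parameters, the grid, the prior, and all function names are assumptions made for this example, and WLE is left out for brevity. Setting the upper asymptote d to 1 reduces the 4PL function to the 3PL model.

```python
import math

def p_4pl(theta, a, b, c, d):
    """4PL probability of a correct response; d = 1 gives the 3PL model."""
    return c + (d - c) / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, responses, items):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    total = 0.0
    for u, (a, b, c, d) in zip(responses, items):
        p = p_4pl(theta, a, b, c, d)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def eap_and_map(responses, items, lo=-4.0, hi=4.0, n=161):
    """Grid-based EAP, its posterior SD, and MAP under an assumed N(0, 1) prior."""
    grid = [lo + i * (hi - lo) / (n - 1) for i in range(n)]
    # Unnormalised log-posterior: log-likelihood plus log prior (constant dropped).
    log_post = [log_likelihood(t, responses, items) - 0.5 * t * t for t in grid]
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    norm = sum(weights)
    # EAP is the posterior mean; its standard error is the posterior SD.
    eap = sum(t * w for t, w in zip(grid, weights)) / norm
    sd = math.sqrt(sum((t - eap) ** 2 * w for t, w in zip(grid, weights)) / norm)
    # MAP is the grid point with the highest posterior density.
    map_est = grid[log_post.index(peak)]
    return eap, sd, map_est

# Hypothetical item parameters (a, b, c, d); not taken from the study.
items = [(1.2, -1.0, 0.20, 0.98), (0.9, 0.0, 0.25, 0.95), (1.5, 1.0, 0.20, 0.97)]
eap, se, map_est = eap_and_map([1, 1, 0], items)
print(f"EAP = {eap:.3f} (SE = {se:.3f}), MAP = {map_est:.3f}")
```

With only three items the prior dominates, so both estimates stay close to zero; with a 20-item test such as the one in the study, the likelihood would carry more weight.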

Keywords

ability estimation methods, item response theory, 3PLM, 4PLM

Details

Primary Language

English

Subjects

-

Section

Research Article

Authors

Ebru Doğruöz *
0000-0001-6572-274X
Türkiye

Çiğdem Akın Arıkan
0000-0001-5255-8792
Türkiye

Publication Date

September 1, 2020

Submission Date

July 2, 2019

Acceptance Date

February 5, 2020

Published in Issue

Year 2020, Issue 50

DOI

https://doi.org/10.9779/pauefd.585774

IZ

https://izlik.org/JA83UP66UN

APA

Doğruöz, E., & Akın Arıkan, Ç. (2020). Comparison of Different Ability Estimation Methods Based on 3 and 4PL Item Response Theory. Pamukkale Üniversitesi Eğitim Fakültesi Dergisi, 50, 50-69. https://doi.org/10.9779/pauefd.585774