Research Article

Comparison of item response theory ability and item parameters according to classical and Bayesian estimation methods

Year 2024, Vol: 11, Issue: 2, 213-248
https://doi.org/10.21449/ijate.1290831

Abstract

This research compares the ability and item parameter estimates of Item Response Theory obtained with maximum likelihood and Bayesian approaches under different Monte Carlo simulation conditions. For this purpose, the ability and item parameters estimated by the maximum likelihood and Bayesian methods, and the differences in the RMSE of these parameters, were examined as a function of prior distribution type, sample size, test length, and logistic model. The simulation conditions were: prior distribution (normal, left-skewed, right-skewed, leptokurtic, and platykurtic), test length (10, 20, 40), sample size (100, 500, 1000), and logistic model (2PL, 3PL), with 100 replications per condition. Mixed-model ANOVA was performed to test for differences in RMSE. Prior distribution type, test length, and estimation method produced significant differences in the RMSE of the ability parameter estimated in the 2PL model; prior distribution type and test length were significant for the ability parameter RMSE in the 3PL model. While prior distribution type, sample size, and estimation method created a significant difference in the RMSE of the item discrimination parameter estimated in the 2PL model, none of the conditions significantly affected the RMSE of the item difficulty parameter. In the 3PL model, prior distribution type, sample size, and estimation method were significant for the item discrimination RMSE, and prior distribution type and estimation method for the RMSE of the lower asymptote parameter; again, none of the conditions significantly changed the RMSE of the item difficulty parameter.
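The core comparison in the abstract — maximum likelihood versus Bayesian ability estimation under a logistic IRT model, evaluated by RMSE against the true abilities — can be sketched in a minimal form. The sketch below is illustrative only (the study itself worked with R-based tooling such as the mirt package); the 2PL model, the fixed item parameters, the N(0, 1) prior, and the grid-search estimators are assumptions of this sketch, not the authors' implementation.

```python
import math
import random

def p_2pl(theta, a, b):
    """2PL item response function: P(correct | theta) for discrimination a, difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def simulate_responses(theta, items, rng):
    """Generate dichotomous (0/1) responses for one examinee."""
    return [1 if rng.random() < p_2pl(theta, a, b) else 0 for a, b in items]

def log_likelihood(theta, responses, items):
    """Log-likelihood of a response pattern at a given theta."""
    ll = 0.0
    for u, (a, b) in zip(responses, items):
        p = p_2pl(theta, a, b)
        ll += math.log(p) if u else math.log(1.0 - p)
    return ll

# Evaluation grid on [-4, 4]; doubles as search grid (ML) and quadrature (EAP).
GRID = [g / 10.0 for g in range(-40, 41)]

def theta_ml(responses, items):
    """Maximum-likelihood ability estimate via grid search."""
    return max(GRID, key=lambda t: log_likelihood(t, responses, items))

def theta_eap(responses, items, prior_mean=0.0, prior_sd=1.0):
    """Bayesian EAP ability estimate: posterior mean under a normal prior."""
    num = den = 0.0
    for t in GRID:
        w = math.exp(log_likelihood(t, responses, items)) \
            * math.exp(-0.5 * ((t - prior_mean) / prior_sd) ** 2)
        num += t * w
        den += w
    return num / den

def rmse(estimates, truths):
    """Root mean squared error of the estimates against the true values."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truths)) / len(truths))

# Tiny demonstration: one replication, 20 items, 100 examinees.
rng = random.Random(42)
items = [(rng.uniform(0.8, 2.0), rng.uniform(-2.0, 2.0)) for _ in range(20)]
truths = [rng.gauss(0.0, 1.0) for _ in range(100)]
ml_est, eap_est = [], []
for theta in truths:
    resp = simulate_responses(theta, items, rng)
    ml_est.append(theta_ml(resp, items))
    eap_est.append(theta_eap(resp, items))
print("RMSE (ML): ", round(rmse(ml_est, truths), 3))
print("RMSE (EAP):", round(rmse(eap_est, truths), 3))
```

For short tests, the EAP estimates are shrunk toward the prior mean, which is the mechanism behind the method and test-length effects on ability RMSE that the abstract reports; a full study would cross all conditions (prior shape, sample size, test length, model) and replicate each cell, as described above.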

Ethical Statement

Ankara University, 04.11.2019, 13-339.

References

  • Akour, M., & Al-Omari, H. (2013). Empirical investigation of the stability of IRT item-parameters estimation. International Online Journal of Educational Sciences, 5(2), 291-301.
  • Atar, B. (2007). Differential item functioning analyses for mixed response data using IRT likelihood-ratio test, logistic regression, and GLLAMM procedures [Unpublished doctoral dissertation, Florida State University]. http://purl.flvc.org/fsu/fd/FSU_migr_etd-0248
  • Baker, F.B. (2001). The basics of item response theory (2nd ed.). College Park, (MD): ERIC Clearinghouse on Assessment and Evaluation.
  • Baker, F.B., & Kim, S. (2004). Item response theory: Parameter estimation techniques (2nd ed.). Marcel Dekker.
  • Barış-Pekmezci, F. & Şengül-Avşar, A. (2021). A guide for more accurate and precise estimations in simulative unidimensional IRT models. International Journal of Assessment Tools in Education, 8(2), 423-453. https://doi.org/10.21449/ijate.790289
  • Bilir, M.K. (2009). Mixture item response theory-mimic model: Simultaneous estimation of differential item functioning for manifest groups and latent classes [Unpublished doctoral dissertation, Florida State University]. http://diginole.lib.fsu.edu/islandora/object/fsu:182011/datastream/PDF/view
  • Bock, R.D., & Mislevy, R.J. (1982). Adaptive EAP estimation of ability in a microcomputer environment. Applied Psychological Measurement, 6, 431-444. https://doi.org/10.1177/014662168200600405
  • Bulmer, M.G. (1979). Principles of statistics. Dover Publications.
  • Bulut, O. & Sünbül, Ö. (2017). R programlama dili ile madde tepki kuramında monte carlo simülasyon çalışmaları [Monte carlo simulation studies in item response theory with the R programming language]. Journal of Measurement and Evaluation in Education and Psychology, 8(3), 266-287. https://doi.org/10.21031/epod.305821
  • Chalmers, R.P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48(6), 1-29. https://doi.org/10.18637/jss.v048.i06
  • Chuah, S.C., Drasgow, F., & Luecht, R. (2006). How big is big enough? Sample size requirements for CAST item parameter estimation. Applied Measurement in Education, 19(3), 241-255. https://doi.org/10.1207/s15324818ame1903_5
  • Clarke, E. (2022, December 22). ggbeeswarm: Categorical scatter (violin point) plots. https://cran.r-project.org/web/packages/ggbeeswarm/index.html
  • Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Lawrence Erlbaum Associates, Publishers.
  • Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. Wadsworth Group.
  • Çelikten, S. & Çakan, M. (2019). Bayesian ve nonBayesian kestirim yöntemlerine dayalı olarak sınıflama indekslerinin TIMSS-2015 matematik testi üzerinde incelenmesi [Investigation of classification indices on Timss-2015 mathematic-subtest through bayesian and nonbayesian estimation methods]. Necatibey Faculty of Education Electronic Journal of Science and Mathematics Education, 13(1), 105-124. https://doi.org/10.17522/balikesirnef.566446
  • De Ayala, R.J. (2009). The theory and practice of item response theory. The Guilford Press.
  • DeMars, C. (2010). Item response theory: understanding statistics measurement. Oxford University Press.
  • Demir, E. (2019). R Diliyle İstatistik Uygulamaları [Statistics Applications with R Language]. Pegem Akademi.
  • Feinberg, R.A., & Rubright, J.D. (2016). Conducting simulation studies in psychometrics. Educational Measurement: Issues and Practice, 35, 36-49. https://doi.org/10.1111/emip.12111
  • Finch, H., & Edwards, J.M. (2016). Rasch model parameter estimation in the presence of a non-normal latent trait using a nonparametric Bayesian approach. Educational and Psychological Measurement, 76(4), 662-684. https://doi.org/10.1177/0013164415608418
  • Fraenkel, J.R., & Wallen, E. (2009). How to design and evaluate research in education. McGraw-Hills Companies.
  • Gao, F., & Chen, L. (2005). Bayesian or non-Bayesian: A comparison study of item parameter estimation in the three-parameter logistic model. Applied Measurement in Education, 18(4), 351-380. https://psycnet.apa.org/doi/10.1207/s15324818ame1804_2
  • Goldman, S.H., & Raju, N.S. (1986). Recovery of one- and two-parameter logistic item parameters: An empirical study. Educational and Psychological Measurement, 46(1), 11-21. https://doi.org/10.1177/0013164486461002
  • Hambleton, R.K. (1989). Principles and selected applications of item response theory. In R.L. Linn (Ed.), Educational Measurement, (pp.147-200). American Council of Education.
  • Hambleton, R.K., & Cook, L.L. (1983). Robustness of item response models and effects of test length and sample size on the precision of ability estimates. In D.J. Weiss (Ed.), New horizons in testing: Latent trait test theory and computerized adaptive testing (pp. 31-49). Vancouver.
  • Hambleton, R.K., & Jones, R.W. (1993). An NCME instructional module on comparison of classical test theory and item response theory and their applications to test development. Educational Measurement: Issues and Practice, 12(3), 38-47. https://doi.org/10.1111/j.1745-3992.1993.tb00543.x
  • Hambleton, R.K., & Swaminathan, H. (1985). Item response theory: Principals and applications. Kluwer Academic Publishers.
  • Hambleton, R.K., Swaminathan, H., & Rogers, H.J. (1991). Fundamentals of item response theory. Sage Publications Inc.
  • Harwell, M., & Janosky, J. (1991). An empirical study of the effects of small datasets and varying prior variances on item parameter estimation in BILOG. Applied Psychological Measurement. 15, 279-291. https://doi.org/10.1177/014662169101500308
  • Harwell, M., Stone, C.A., Hsu, T.C., & Kirisci, L. (1996). Monte Carlo studies in item response theory. Applied Psychological Measurement, 20(2), 101-125. https://doi.org/10.1177/014662169602000201
  • Hoaglin, D.C., & Andrews, D.F. (1975). The reporting of computation-based results in statistics. The American Statistician, 29, 122-126. https://doi.org/10.2307/2683438
  • Hulin, C.L., Lissak, R.I., & Drasgow, F. (1982). Recovery of two and three-parameter logistic item characteristic curves: A monte carlo study. Applied Psychological Measurement, 6, 249-260. https://psycnet.apa.org/doi/10.1177/014662168200600301
  • Karadavut, T. (2019). The uniform prior for bayesian estimation of ability in item response theory models. International Journal of Assessment Tools in Education, 6(4), 568-579. https://dx.doi.org/10.21449/ijate.581314
  • Kıbrıslıoğlu Uysal, N. (2020). Parametrik ve Parametrik Olmayan Madde Tepki Modellerinin Kestirim Hatalarının Karşılaştırılması [Comparison of estimation errors in parametric and nonparametric item response theory models] [Unpublished doctoral dissertation, Hacettepe University]. http://hdl.handle.net/11655/22495
  • Kirisci, L., Hsu, T.C., & Yu, L. (2001). Robustness of item parameter estimation programs to assumptions of unidimensionality and normality. Applied Psychological Measurement, 25(2), 146-162. https://doi.org/10.1177/01466210122031975
  • Kolen, M.J. (1985). Standard errors of Tucker equating. Applied Psychological Measurement, 9(2), 209-223. https://doi.org/10.1177/014662168500900209
  • Kothari, C.R. (2004). Research methodology: methods and techniques (2nd ed.). New Age International Publishers.
  • Köse, İ.A. (2010). Madde Tepki Kuramına Dayalı Tek Boyutlu ve Çok Boyutlu Modellerin Test Uzunluğu ve Örneklem Büyüklüğü Açısından Karşılaştırılması [Comparison of Unidimensional and Multidimensional Models Based On Item Response Theory In Terms of Test Length and Sample Size] [Unpublished doctoral dissertation]. Ankara University, Institute of Educational Sciences.
  • Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: a practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4(863), 1-12. https://doi.org/10.3389/fpsyg.2013.00863
  • Lenth, R.V. (2022, December). emmeans: Estimated marginal means, aka Least-Squares Means. https://cran.r-project.org/web/packages/emmeans/index.html
  • Lim, H., & Wells, C.S. (2020). irtplay: An R package for online item calibration, scoring, evaluation of model fit, and useful functions for unidimensional IRT. Applied Psychological Measurement, 44(7-8), 563-565. https://doi.org/10.1177/0146621620921247
  • Linacre, J.M. (2008). A user’s guide to winsteps ministep: rasch-model computer programs. https://www.winsteps.com/winman/copyright.htm
  • Lord, F.M. (1968). An analysis of the verbal scholastic aptitude test using Birnbaum’s three-parameter logistic model. Educational and Psychological Measurement, 28, 989-1020. https://doi.org/10.1177/001316446802800401
  • Lord, F.M. (1980). Applications of item response theory to practical testing problems. Lawrence Erlbaum Associates.
  • Lord, F.M. (1983). Unbiased estimators of ability parameters, of their variance, and of their parallel forms reliability. Psychometrika, 48, 233-245. https://doi.org/10.1007/BF02294018
  • Martin, A.D., & Quinn, K.M. (2006). Applied Bayesian inference in R using MCMCpack. The Newsletter of the R Project, 6(1), 2-7.
  • Martinez, J. (2017, December 1). bairt: Bayesian analysis of item response theory models. http://cran.nexr.com/web/packages/bairt/index.html
  • Maydeu-Olivares, A., & Joe, H. (2006). Limited information goodness-of-fit testing in multidimensional contingency tables. Psychometrika, 71, 713-732. https://doi.org/10.1007/s11336-005-1295-9
  • Meyer, D. (2022, December 1). e1071: Misc functions of the department of statistics, Probability Theory Group (Formerly: E1071), TU Wien. https://CRAN.R-project.org/package=e1071
  • Mislevy, R.J. (1986). Bayes modal estimation in item response models. Psychometrika, 51, 177-195. https://doi.org/10.1007/BF02293979
  • MoNE (2022). Sınavla Öğrenci Alacak Ortaöğretim Kurumlarına İlişkin Merkezî Sınav Başvuru ve Uygulama Kılavuzu [Central Examination Application and Administration Guide for Secondary Education Schools to Admit Students by Examination]. Ankara: MoNE [MEB]. https://www.meb.gov.tr/2022-lgs-kapsamindaki-merkez-sinav-kilavuzu-yayimlandi/haber/25705/tr
  • Morris, T.P., White, I.R., & Crowther, M.J. (2019). Using simulation studies to evaluate statistical methods. Statistics in Medicine, 38(11), 2074-2102. https://doi.org/10.1002/sim.8086
  • Orlando, M. (2004, June). Critical issues to address when applying item response theory models. Paper presented at the conference on improving health outcomes assessment, National Cancer institute, Bethesda, MD, USA.
  • Pekmezci Barış, F. (2018). İki Faktör Modelde (Bifactor) Diklik Varsayımının Farklı Koşullar Altında Sınanması [Investigation Of Orthogonality Assumption In Bifactor Model Under Different Conditions] [Unpublished doctoral dissertation]. Ankara University, Institute of Educational Sciences, Ankara.
  • Ree, M.J., & Jensen, H.E. (1980). Effects of sample size on linear equating of item characteristic curve parameters. In D.J. Weiss (Ed.), Proceedings of the 1979 computerized adaptive testing conference. (pp. 218-228). Minneapolis: University of Minnesota. https://doi.org/10.1016/B978-0-12-742780-5.50017-2
  • Reise, S.P., & Yu, J. (1990). Parameter recovery in the graded response model using MULTILOG. Journal of Educational Measurement, 27, 133-144. https://doi.org/10.1111/j.1745-3984.1990.tb00738.x
  • Revelle, W. (2022, October). psych: Procedures for psychological, psychometric, and personality research. https://cran.r-project.org/web/packages/psych/index.html
  • Robitzsch, A. (2022). sirt: Supplementary item response theory models. https://cran.r-project.org/web/packages/sirt/index.html
  • Samejima, F. (1993a). An approximation for the bias function of the maximum likelihood estimate of a latent variable for the general case where the item responses are discrete. Psychometrika, 58, 119-138. https://doi.org/10.1007/BF02294476
  • Samejima, F. (1993b). The bias function of the maximum likelihood estimate of ability for the dichotomous response level. Psychometrika, 58, 195-209. https://doi.org/10.1007/BF02294573
  • Sarkar, D. (2022, October). lattice: Trellis graphics for R. R package version 0.20-45, URL http://CRAN.R-project.org/package=lattice.
  • SAS Institute (2020). Introduction to Bayesian analysis procedures. In User’s Guide Introduction to Bayesian Analysis Procedures. (pp. 127-161). SAS Institute Inc., Cary, (NC), USA.
  • Sass, D., Schmitt, T., & Walker, C. (2008). Estimating non-normal latent trait distributions within item response theory using true and estimated item parameters. Applied Measurement in Education, 21(1), 65-88. https://doi.org/10.1080/08957340701796415
  • Seong, T.J. (1990). Sensitivity of marginal maximum likelihood estimation of item and ability parameters to the characteristics of the prior ability distributions. Applied Psychological Measurement, 14(3), 299-311. https://psycnet.apa.org/doi/10.1177/014662169001400307
  • Singmann, H. (2022, December). afex: Analysis of factorial experiments. https://cran.r-project.org/web/packages/afex/afex.pdf
  • Soysal, S. (2017). Toplam Test Puanı ve Alt Test Puanlarının Kestiriminin Hiyerarşik Madde Tepki Kuramı Modelleri ile Karşılaştırılması [Comparison of Estimation of Total Score and Subscores with Hierarchical Item Response Theory Models] [Unpublished doctoral dissertation]. Hacettepe University, Institute of Educational Sciences, Ankara.
  • Stone, C.A. (1992). Recovery of marginal maximum likelihood estimates in the two parameter logistic response model: An evaluation of MULTILOG. Applied Psychological Measurement, 16(1), 1-16. https://doi.org/10.1177/014662169201600101
  • Swaminathan, H., & Gifford, J.A. (1986). Bayesian estimation in the three-parameter logistic model. Psychometrika, 51, 589-601. https://doi.org/10.1007/BF02295598
  • Şahin, A., & Anıl, D. (2017). The effects of test length and sample size on item parameters in item response theory. Educational Sciences: Theory & Practice, 17, 321-335. http://dx.doi.org/10.12738/estp.2017.1.0270
  • Tabachnick, B.G., & Fidell, L.S. (2014). Using multivariate statistics (6th ed.). Pearson New International Edition.
  • Thissen, D., & Wainer, H. (1983). Some standard errors in item response theory. Psychometrika, 47, 397-412. https://doi.org/10.1007/BF02293705
  • Thorndike, L.R. (1982). Applied Psychometrics. Houghton Mifflin Co.
  • Van de Schoot, R., & Depaoli, S. (2014). Bayesian analyses: Where to start and what to report. The European Health Psychologist, 16(2), 75-84.
  • Wickham, H. (2016). ggplot2: Elegant graphics for data analysis. Springer-Verlag. https://doi.org/10.1007/978-0-387-98141-3
  • Wright, B.D., & Stone, M.H. (1979). Best test design. Mesa Press.
  • Yen, W.M. (1984). Effects of local item dependence on the fit and equating performance of the three parameter logistic model. Applied Psychological Measurement, 8, 125-145. https://doi.org/10.1177/014662168400800201

There are 76 references in total.

Details

Primary Language: English
Subjects: Studies on Education
Section: Articles
Authors

Eray Selçuk 0000-0003-4033-4219

Ergül Demir 0000-0002-3708-8013

Early View Date: May 22, 2024
Publication Date:
Submission Date: May 1, 2023
Published Issue: Year 2024, Vol: 11, Issue: 2

How to Cite

APA Selçuk, E., & Demir, E. (2024). Comparison of item response theory ability and item parameters according to classical and Bayesian estimation methods. International Journal of Assessment Tools in Education, 11(2), 213-248. https://doi.org/10.21449/ijate.1290831
