Comparison of Kernel Equating and Kernel Local Equating in Item Response Theory Observed Score Equating

Merve Yıldırım Seheryeli; Hasibe Yahsi Sarı; Hülya Kelecioğlu

doi:10.21031/epod.900843

Research Article

Year 2021, Volume: 12 Issue: 4, 348 - 357, 29.12.2021

Abstract

The present study aims to compare the Kernel equating and Kernel local equating methods in observed score equating. Within Item Response Theory Observed Score Equating, the equating functions and error estimates of Kernel equating and Kernel local equating were examined, covering the differences between raw and equated scores as well as the scores equated with the Stocking-Lord and Haebara true-score equating methods. To this end, 5, 10, and 15 external anchor items were used, and scores were obtained from two forms based on the 2PL model. R (version 3.5.3) was used to check the IRT assumptions, estimate the item parameters, and carry out the calibration and equating analyses. The results revealed that the Stocking-Lord and Haebara true-score equating methods yielded similar results. Moreover, for a given equating method, estimation errors decreased as the number of anchor items increased. The mean scores obtained from Kernel equating with 5 and 15 anchor items were lower than those from Kernel local equating, whereas the means from Kernel equating with 10 anchor items were higher. As the number of items increased, estimation errors decreased, and Kernel local equating yielded the lowest errors in the middle of the score scale. Kernel equating can be used according to the relevant ability level when the ability distribution of the individuals is known.
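As a rough illustration of the setting described in the abstract, the sketch below computes a 2PL response probability and the number-correct score distribution it implies at a fixed ability level, using the Lord-Wingersky recursion that is commonly used in IRT observed score equating. It is not the authors' code (the study used R), and the function names and item parameters are hypothetical values chosen only for demonstration.

```python
import math


def p_2pl(theta, a, b):
    """2PL probability of a correct response (a: discrimination, b: difficulty)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def score_distribution(theta, items):
    """Lord-Wingersky recursion: P(number-correct score = x | theta) for x = 0..n."""
    dist = [1.0]  # with zero items, a score of 0 has probability 1
    for a, b in items:
        p = p_2pl(theta, a, b)
        new = [0.0] * (len(dist) + 1)
        for x, prob in enumerate(dist):
            new[x] += prob * (1.0 - p)   # item answered incorrectly
            new[x + 1] += prob * p       # item answered correctly
        dist = new
    return dist


# Hypothetical (a, b) parameters for a three-item form; not taken from the study.
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]
dist = score_distribution(0.0, items)
expected_score = sum(x * prob for x, prob in enumerate(dist))
```

In a full observed score equating, conditional distributions of this kind would be averaged over an ability distribution for each form before the two score distributions are continuized and equated; that step is omitted here.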

Keywords

Test equating, Kernel equating, Kernel local equating, item response theory, local equating

References

Akın Arıkan, Ç. (2017). Kernel eşitleme ve madde tepki kuramına dayalı eşitleme yöntemlerinin karşılaştırılması (Published doctoral dissertation). Hacettepe Üniversitesi, Eğitim Bilimleri Enstitüsü, Ankara.
Andersson, B., & Wiberg, M. (2014). IRT observed-score kernel equating with the R package kequate. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.372.8712&rep=rep1&type=pdf
Andersson, B., Bränberg, K., & Wiberg, M. (2020). Package ‘kequate’. Retrieved from https://mran.microsoft.com/snapshot/2020-03-08/web/packages/kequate/kequate.pdf
Baker, F. B. (2016). Madde tepki kuramının temelleri [The basics of item response theory]. (N. Güler, Ed., & M. İlhan, Trans.). Ankara: Pegem Akademi. (1985)
Chalmers, P., Pritikin, J., Robitzsch, A., Zoltak, M., Kim, K. H., Falk, C. F., …, & Oguzhan, O. (2021). Package ‘mirt’. Retrieved from https://cran.r-project.org/web/packages/mirt/mirt.pdf
Choi, S. I. (2009). A comparison of kernel equating and traditional equipercentile equating methods and the parametric bootstrap methods for estimating standard errors in equipercentile equating (Unpublished doctoral dissertation). University of Illinois at Urbana-Champaign.
Diao, H. (2018). Investigation repeater effects on small-sample equating: Include or exclude? (Doctoral thesis). University of Massachusetts-Amherst.
Gök, B., & Kelecioğlu, H. (2014). Denk olmayan gruplarda ortak madde deseni kullanılarak madde tepki kuramına dayalı eşitleme yöntemlerinin karşılaştırılması. Mersin Üniversitesi Eğitim Fakültesi Dergisi, 10(1), 120-136. Retrieved from https://dergipark.org.tr/tr/download/article-file/161036
González, J., & Wiberg, M. (2017). Applying test equating methods: Using R. Switzerland: Springer International Publishing. Retrieved from http://www.mat.uc.cl/~jorge.gonzalez/index_archivos/EquatingRbook.htm
Haladyna, T. M., & Downing, S. M. (2004). Construct-irrelevant variance in high-stakes testing. Educational Measurement: Issues and Practice, 23(1), 17-27. doi: 10.1111/J.1745-3992.2004.TB00149.X
Hambleton, R. K., & Swaminathan, H. (1985). Item response theory: Principles and applications. Boston: Kluwer-Nijhoff Publishing.
Holland, P. W., & Thayer, D. T. (1981). Section pre‐equating the graduate record examinations (ETS Research Report Series). 1981(2), i-62.
Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices. New York: Springer.
Liou, M., Cheng, P. E., & Johnson, E. G. (1997). Standard errors of the Kernel equating methods under the common-item design. Applied Psychological Measurement, 21(4), 349-369. doi: 10.1177/01466216970214005
Norman Dvorak, R. L. (2009). A comparison of kernel equating to the test characteristic curve method (Unpublished doctoral dissertation). University of Nebraska-Lincoln.
Öztürk-Gübeş, N. (2019). Test eşitlemede çok boyutluluğun eş zamanlı ve ayrı kalibrasyona etkisi. Hacettepe Üniversitesi Eğitim Fakültesi Dergisi, 34(4), 1061-1074. doi: 10.16986/HUJE.2019049186
Öztürk-Gübeş, N., & Kelecioğlu, H. (2015). Farklı test eşitleme yöntemlerinin eşitlik özelliği ölçütüne göre karşılaştırılması. Ankara Üniversitesi Eğitim Bilimleri Fakültesi Dergisi, 48(1), 299-214. doi: 10.1501/Egifak_0000001358
Pektaş, S., & Kılınç, M. (2016). PISA 2012 matematik testlerinden iki kitapçığın gözlenen puan eşitleme yöntemleri ile eşitlenmesi. Mehmet Akif Ersoy Üniversitesi Eğitim Fakültesi Dergisi, 1(40), 432-444. Retrieved from https://dergipark.org.tr/tr/download/article-file/264191
Revelle, W. (2021). Package ‘psych’. Retrieved from https://cran.rstudio.org/web/packages/psych/psych.pdf
Rizopoulos, D. (2018). Package ‘ltm’. Retrieved from https://cran.r-project.org/web/packages/ltm/ltm.pdf
Tanberkan-Suna, H. (2018). Grup değişmezliği özelliğinin farklı eşitleme yöntemlerinde eşitleme fonksiyonları üzerindeki etkisi (Published doctoral dissertation). Gazi Üniversitesi, Eğitim Bilimleri Enstitüsü, Ankara.
Uysal, İ. (2014). Madde tepki kuramına dayalı test eşitleme yöntemlerinin karma modeller üzerinde karşılaştırılması (Unpublished master's thesis). Abant İzzet Baysal Üniversitesi, Eğitim Bilimleri Enstitüsü, Bolu.
van der Linden, W. J. (2000). A test‐theoretic approach to observed‐score equating. Psychometrika, 65(4), 437-456. Retrieved from https://link.springer.com/content/pdf/10.1007/BF02296337.pdf
von Davier, A. A. (2008). New results on the linear equating methods for the non-equivalent groups design. Journal of Educational and Behavioral Statistics, 33(2), 186-203. doi: 10.3102/1076998607302633
von Davier, A. A. (2013). Observed-score equating: An overview. Psychometrika, 78(4), 605-623. doi: 10.1007/s11336-013-9319-3
von Davier, A., Holland, P. W., & Thayer, D. T. (2004). The Kernel method of equating. New York: Springer.
Wang, S., Zhang, M., & You, S. (2020). A Comparison of IRT Observed Score Kernel Equating and Several Equating Methods. Frontiers in Psychology, 11, 308. doi: 10.3389/fpsyg.2020.00308
Wang, T., Lee, W. C., Brennan, R. L., & Kolen, M. J. (2008). A comparison of the frequency estimation and chained equipercentile methods under the common-item non-equivalent groups design. Applied Psychological Measurement, 32(8), 632-651. doi: 10.1177/0146621608314943
Wiberg, M., van der Linden, W. J., & von Davier, A. A. (2014). Local observed‐score kernel equating. Journal of Educational Measurement, 51(1), 57-74. doi: 10.1111/jedm.12034

There are 29 citations in total.

Details

Primary Language	English
Journal Section	Articles
Authors	Merve Yıldırım Seheryeli 0000-0002-1106-5358; Hasibe Yahsi Sarı 0000-0002-0451-6034; Hülya Kelecioğlu 0000-0001-6274-2016
Publication Date	December 29, 2021
Acceptance Date	September 30, 2021
Published in Issue	Year 2021 Volume: 12 Issue: 4

Cite

APA	Yıldırım Seheryeli, M., Yahsi Sarı, H., & Kelecioğlu, H. (2021). Comparison of Kernel Equating and Kernel Local Equating in Item Response Theory Observed Score Equating. Journal of Measurement and Evaluation in Education and Psychology, 12(4), 348-357. https://doi.org/10.21031/epod.900843