Research Article

COMPARING EQUATING ERRORS UNDER VARIOUS FACTORS FOR SUBTESTS WITH ADDED VALUE

Year 2024, Volume: 6 Issue: 1, 92 - 111, 25.06.2024
https://doi.org/10.48166/ejaes.1438652

Abstract

In this study, equating was carried out under a common-item (anchor) equating design, using subtest and augmented subtest scores, for tests in which each subtest had added value. The equating errors of the methods used were compared with respect to sample size, the difference in average difficulty between subtests, and subtest length.
Dichotomous (1-0) data conforming to the two-parameter logistic model (2PLM) were generated with R 3.1.1 for test forms X and Y, each consisting of two subtests. The anchor test, like the total test form, consisted of two subtests; each anchor subtest contained 40% as many items as the corresponding subtest of the total test. For sample sizes of 20, 25, 50, 100, 200, and 500, forms X and Y were generated with correlations of 0.70, 0.80, and 0.90 between the subtests of each test; average difficulty differences of 0.0, 0.40, and 0.70 between the test forms; and subtest lengths of 10, 15, 30, 50, and 80. The subtests were equated over 100 replications using the identity, chained linear, Braun/Holland, and circle-arc equating methods, and the results were evaluated with equating error criteria.
Equating was judged appropriate when the sample size was 100, 200, or 500, the subtest length was 10, 15, or 30, and the average difficulty difference between forms was 0.0; under these conditions, the circle-arc equating method showed smaller error values than the other methods.
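The data-generation step described above, dichotomous responses under the 2PL model, can be sketched as follows. The study itself used R 3.1.1; this is a minimal Python sketch, and the ability and item-parameter distributions shown are illustrative assumptions, since the abstract does not report them.

```python
import numpy as np

def simulate_2pl(theta, a, b, rng):
    """Simulate dichotomous (1-0) responses under the 2PL model:
    P(X = 1 | theta) = 1 / (1 + exp(-a * (theta - b)))."""
    # persons x items matrix of correct-response probabilities
    z = a[None, :] * (theta[:, None] - b[None, :])
    p = 1.0 / (1.0 + np.exp(-z))
    # compare uniform draws against probabilities to get 1-0 data
    return (rng.random(p.shape) < p).astype(int)

rng = np.random.default_rng(42)
theta = rng.normal(0.0, 1.0, size=200)   # examinee abilities (assumed N(0,1))
a = rng.uniform(0.8, 2.0, size=10)       # discriminations (illustrative range)
b = rng.normal(0.0, 1.0, size=10)        # difficulties (illustrative)
data = simulate_2pl(theta, a, b, rng)    # 200 x 10 matrix of 1-0 responses
```

Shifting the mean of `b` between forms is one way such a design can impose the average difficulty differences (0.0, 0.40, 0.70) varied in the study.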

References

  • Albano, A. D. (2016). equate: An R package for observed-score linking and equating. R package version 2.0-3. http://CRAN.R-project.org/package=equate
  • Angoff, W. H. (1984). Scales, norms, and equivalent scores. Princeton, NJ: Educational Testing Service.
  • Baykul, Y. (2010). Eğitimde ve Psikolojide Ölçme: Klasik Test Teorisi ve Uygulaması. Ankara: Pegem Yayınları.
  • Chu, K. L., & Kamata, A. (2003). Test equating with the presence of DIF. Paper presented at the annual meeting of the American Educational Research Association, Chicago.
  • Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory. New York: Holt, Rinehart and Winston.
  • Haberman, S. J. (2008). When can subscores have value? Journal of Educational and Behavioral Statistics, 33, 204-229.
  • Kan, A. (2010). Test eşitleme: Aynı davranışları ölçen, farklı madde formlarına sahip testlerin istatistiksel eşitliğinin sınanması. Eğitim ve Psikolojide Ölçme ve Değerlendirme Dergisi, 16-21.
  • Kim, S., von Davier, A. A., & Haberman, S. (2008). Small sample equating using a synthetic linking function. Journal of Educational Measurement, 45, 325-342.
  • Kolen, M. J., & Brennan, R. L. (2004). Test equating: Methods and practices (2nd ed.). New York, NY: Springer-Verlag.
  • Kolen, M. J., & Brennan, R. L. (2014). Test equating, scaling, and linking: Methods and practice (3rd ed.). New York, NY: Springer.
  • Livingston, S. A. (2004). Equating test scores (without IRT). Princeton, NJ: ETS.
  • Livingston, S. A., & Kim, S. (2009). The circle-arc method for equating in small samples. Journal of Educational Measurement, 46(3), 330-343.
  • Sinharay, S. (2010a). When can subscores be expected to have added value? Results from operational and simulated data. Princeton, NJ: Educational Testing Service.
  • Sinharay, S. (2010b). How often do subscores have added value? Results from operational and simulated data. Journal of Educational Measurement, 47(2), 150-174.
  • Sinharay, S., & Haberman, S. J. (2011). Equating of augmented subscores. Journal of Educational Measurement, 48, 122-145.
  • Sinharay, S., Haberman, S., & Puhan, G. (2007). Subscores based on classical test theory: To report or not to report. Princeton, NJ: Educational Testing Service.
  • Sinharay, S., & Holland, P. W. (2007). Is it necessary to make anchor tests mini-versions of the tests being equated or can some restrictions be relaxed? Journal of Educational Measurement, 44(3), 249-275.
  • Sinharay, S., Puhan, G., & Haberman, S. J. (2011). An NCME instructional module on subscores. Educational Measurement: Issues and Practice, 30(3), 29-40.
  • von Davier, A. A. (2010). Statistical models for test equating, scaling and linking. New York: Springer.
  • von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004). The kernel method of test equating. New York, NY: Springer-Verlag.
  • Wang, T. (2006). Standard errors of equating for equipercentile equating with log-linear pre-smoothing using the delta method. Iowa City: Center for Advanced Studies in Measurement and Assessment (CASMA).

COMPARING EQUATING ERRORS ON VARIOUS FACTORS FOR SUBTESTS WHICH HAVE ADDED VALUE


Abstract

In this study, dichotomous data conforming to the two-parameter logistic model (2PLM) were generated for forms X and Y using R 3.1.1. Each test form consisted of two subtests, as did the anchor test, whose subtest length was 40% of that of the corresponding subtest in form X (or Y). For both forms, the correlation between subtests was varied across three levels (0.70, 0.80, and 0.90), and the average difficulty difference between forms X and Y was varied across three levels (0.0, 0.4, and 0.7). The simulated forms were equated using the identity, chained linear, Braun/Holland, and circle-arc methods for six sample sizes (20, 25, 50, 100, 200, and 500) with 100 replications. The results of this simulation study were evaluated using equating error criteria.
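The abstract does not specify which error criteria were used; a common choice in equating simulations is the bias and root mean squared error (RMSE) of the estimated equating function against a criterion function, aggregated over replications. A minimal sketch under that assumption:

```python
import numpy as np

def equating_bias_rmse(estimates, criterion):
    """Bias and RMSE of an equating function over replications.

    estimates : (n_replications, n_scores) array of equated scores,
                one row per replication
    criterion : (n_scores,) criterion (population) equating function
    """
    diff = estimates - criterion[None, :]          # error per replication
    bias = diff.mean(axis=0)                       # mean error per score point
    rmse = np.sqrt((diff ** 2).mean(axis=0))       # RMSE per score point
    return bias, rmse

# toy example: two replications, two score points
estimates = np.array([[1.0, 2.0],
                      [3.0, 4.0]])
criterion = np.array([2.0, 3.0])
bias, rmse = equating_bias_rmse(estimates, criterion)
```

Averaging such per-score-point values over the score scale gives a single summary figure per condition, which is how results across sample sizes and difficulty differences are typically compared.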
The findings indicated that when the sample size was 100 or more, the subtest length was 10, 15, or 30, and the average difficulty difference between forms was 0.0, equating the forms gave better results than not equating. Furthermore, the circle-arc method was found to produce less equating error than the other methods under most of the conditions studied.
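The circle-arc method referred to above (Livingston & Kim, 2009) is designed for small samples: it passes an arc through three points, the lowest possible score pair, the highest possible score pair, and a middle point estimated from the observed data. A minimal geometric sketch of the simplified version (fitting a circle directly through the three points); the score values used here are made up for illustration:

```python
import numpy as np

def circle_through(p1, p2, p3):
    """Center (cx, cy) and radius of the circle through three
    non-collinear points, via the perpendicular-bisector equations."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    A = np.array([[x2 - x1, y2 - y1],
                  [x3 - x1, y3 - y1]])
    rhs = 0.5 * np.array([x2**2 - x1**2 + y2**2 - y1**2,
                          x3**2 - x1**2 + y3**2 - y1**2])
    cx, cy = np.linalg.solve(A, rhs)
    return cx, cy, np.hypot(x1 - cx, y1 - cy)

def circle_arc_equate(x, low, mid, high):
    """Equated score at raw score x, on the arc branch through mid."""
    cx, cy, r = circle_through(low, mid, high)
    val = np.sqrt(r**2 - (x - cx)**2)
    sign = 1.0 if mid[1] >= cy else -1.0
    return cy + sign * val

# illustrative anchors: min and max possible scores, plus a midpoint
# taken from the two forms' observed means (values are hypothetical)
low, high = (0.0, 0.0), (40.0, 40.0)
mid = (20.0, 22.5)
y_equated = circle_arc_equate(25.0, low, mid, high)
```

Because the arc is pinned at the score-range endpoints, the method cannot produce out-of-range equated scores, which is part of why it behaves well in small samples.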


Details

Primary Language English
Subjects Specialist Studies in Education (Other)
Journal Section Articles
Authors

Arzu Uçar 0000-0002-0099-1348

Önder Sünbül 0000-0002-1775-1404

Early Pub Date June 25, 2024
Publication Date June 25, 2024
Submission Date February 17, 2024
Acceptance Date April 17, 2024
Published in Issue Year 2024 Volume: 6 Issue: 1

Cite

APA Uçar, A., & Sünbül, Ö. (2024). COMPARING EQUATING ERRORS ON VARIOUS FACTORS FOR SUBTESTS WHICH HAVE ADDED VALUE. Journal of Advanced Education Studies, 6(1), 92-111. https://doi.org/10.48166/ejaes.1438652
