A Comparison of Kernel Equating Methods Based on Neat Design

Cigdem Akın Arıkan

Research Article

Year 2019, Volume: 19 Issue: 82, 27 - 44, 31.07.2019

Abstract

Problem Statement: Equating can be defined as a statistical process that adjusts for differences between test forms of similar content and difficulty so that the scores obtained from these forms can be used interchangeably. The literature offers many equating methods, one of which is Kernel equating. The Trends in International Mathematics and Science Study (TIMSS) assesses the knowledge and skills that fourth- and eighth-grade students have gained in mathematics and science. TIMSS has different test forms, and these forms are equated through common items.
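To make the idea of putting scores from one form onto the scale of another concrete, the following is a minimal sketch of observed-score linear equating, which matches the mean and standard deviation of the two forms. It assumes both forms were taken by equivalent groups, so it is simpler than the NEAT design used in the study, where the adjustment runs through the common items. The function name `linear_equate` and the raw scores are hypothetical and are not taken from the article.

```python
from statistics import mean, pstdev


def linear_equate(x, scores_x, scores_y):
    """Map a score x on form X to the scale of form Y.

    Matches the mean and standard deviation of the two score
    distributions (equivalent-groups assumption, not NEAT).
    """
    mu_x, mu_y = mean(scores_x), mean(scores_y)
    sd_x, sd_y = pstdev(scores_x), pstdev(scores_y)
    return mu_y + (sd_y / sd_x) * (x - mu_x)


# Hypothetical raw scores on two forms (illustration only).
form_x = [10, 12, 14, 16, 18]
form_y = [13, 16, 19, 22, 25]

equated_mid = linear_equate(14, form_x, form_y)
equated_high = linear_equate(18, form_x, form_y)
print(equated_mid, equated_high)
```

In this toy example form Y is more spread out than form X, so scores above the form X mean are stretched upward when they are expressed on the form Y scale.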

Purpose of the Study: This research aimed to compare the equated scores produced by Kernel equating (KE) methods, namely the chained and post-stratification equipercentile and linear equating methods, under the NEAT design.

Methodology: TIMSS science data were used in this study. The study sample consisted of 865 eighth-grade examinees who were given Booklets 1 and 14 during the TIMSS administration in Turkey. There were 39 items in Booklet 1 and 38 items in Booklet 14. First, descriptive statistics were calculated, and the two booklets were then equated under the NEAT design using the Kernel chained and Kernel post-stratification equipercentile and linear equating methods. Second, the equating methods were evaluated against several criteria: the difference that matters (DTM), percent relative error (PRE), standard error of equating (SEE), standard error of equating difference (SEED), and root mean squared difference (RMSD).
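As an illustration of one of these criteria, the sketch below computes the percent relative error (PRE) in the p-th moment, using the usual definition from the Kernel equating literature: the p-th moment of the equated form X scores is compared with the p-th moment of the target form Y distribution, and the difference is expressed as a percentage of the latter. The function names `moment` and `pre` and all score values and probabilities are hypothetical; the article's own computations were presumably carried out with dedicated equating software rather than code like this.

```python
def moment(scores, probs, p):
    """p-th raw moment of a discrete score distribution."""
    return sum(pr * s ** p for s, pr in zip(scores, probs))


def pre(equated_x, probs_x, scores_y, probs_y, p):
    """Percent relative error in the p-th moment.

    equated_x : form X scores after equating to the Y scale
    probs_x   : score probabilities of form X
    scores_y  : possible form Y scores
    probs_y   : score probabilities of form Y
    """
    target = moment(scores_y, probs_y, p)
    return 100.0 * (moment(equated_x, probs_x, p) - target) / target


# Hypothetical three-point score scale (illustration only).
scores_y = [0, 1, 2]
probs_y = [0.25, 0.5, 0.25]
equated_x = [0.1, 1.0, 1.9]
probs_x = [0.25, 0.5, 0.25]

for p in (1, 2):
    print(p, round(pre(equated_x, probs_x, scores_y, probs_y, p), 3))
```

PRE values close to zero across the first several moments indicate that the equated scores reproduce the target distribution well.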

Findings and Results: The results of the equipercentile and linear equating methods were consistent with each other, except at the high end of the score scale. PRE values showed that the KE equipercentile methods matched the discrete target distribution Y more closely, and the SEED results showed that the KE equipercentile and linear methods did not differ significantly from each other with respect to the DTM.
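The comparison described here can be sketched as follows: at each score point, the difference between two equating functions is checked against the DTM and against a band of ±2 SEED. The abstract does not state the DTM value used, so the default of 0.5 raw-score points below is an assumption based on a common convention. The function name `compare_equatings` and all numbers are hypothetical.

```python
def compare_equatings(eq_a, eq_b, seed, dtm=0.5):
    """Flag score points where two equating functions differ notably.

    eq_a, eq_b : equated scores from two methods at the same score points
    seed       : standard error of the equating difference at each point
    dtm        : difference that matters (0.5 is an assumed convention)
    """
    rows = []
    for a, b, s in zip(eq_a, eq_b, seed):
        diff = a - b
        rows.append({
            "diff": diff,
            "exceeds_dtm": abs(diff) > dtm,
            "outside_2seed": abs(diff) > 2 * s,
        })
    return rows


# Hypothetical equated scores at three score points (illustration only).
equipercentile = [0.2, 10.1, 20.9]
linear = [0.1, 10.0, 20.1]
seed = [0.3, 0.1, 0.2]

result = compare_equatings(equipercentile, linear, seed)
for row in result:
    print(row)
```

In this made-up example only the highest score point is flagged on both checks.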

Keywords

Equating, equipercentile, linear, RMSD, SEED

References

Andersson, B., Bränberg, K., & Wiberg, M. (2013). Performing the kernel method of test equating with the package kequate. Journal of Statistical Software, 55(6), 1-25.
Akin-Arikan, Ç. (2017). Kernel Eşitleme ve Madde Tepki Kuramına Dayalı Eşitleme Yöntemlerinin Karşılaştırılması [Comparison of kernel equating and item response theory equating methods] (Unpublished doctoral dissertation). Hacettepe University, Ankara, Turkey.
Akin-Arikan, Ç. & Gelbal, S. (2018). A Comparison of Traditional and Kernel Equating Methods. International Journal of Assessment Tools in Education, 5(3), 417-427. doi: 10.21449/ijate.409826
Choi, S. I. (2009). A Comparison of Kernel Equating and Traditional Equipercentile Equating Methods and the Parametric Bootstrap Methods for Estimating Standard Errors in Equipercentile Equating (Unpublished doctoral dissertation). University of Illinois at Urbana-Champaign, United States.
Godfrey, K. E. (2007). A comparison of Kernel equating and IRT true score equating methods (Unpublished doctoral dissertation). The University of North Carolina, United States.
Grant, M. C., Zhang, L., & Damiano, M. (2009). An Evaluation of Kernel Equating: Parallel Equating with Classical Methods in the SAT Subject Tests [TM] Program. (ETS RR-09-06). Princeton, NJ: Educational Testing Service.
Holland, P. W., & Dorans, N. J. (2006). Linking and equating. In R. L. Brennan (Ed.), Educational measurement (pp. 187–220). Westport, CT: Praeger Publishers.
Holland, P., von Davier, A., Sinharay, S., & Han, N. (2006). Testing the untestable assumptions of the chain and post-stratification equating methods for the NEAT design (ETS RR-06-17). Princeton, NJ: Educational Testing Service.
Kolen, M. J. (1988). Traditional equating methodology. Educational Measurement: Issues and Practice, 7(4), 29-37. doi: 10.1111/j.1745-3992.1988.tb00843.x
Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices. New York: Springer.
Lee, Y. H., & von Davier, A. A. (2010). Equating through alternative kernels. In A.A. von Davier (Ed.) Statistical models for test equating, scaling, and linking (pp. 159-173). Springer New York.
Liou, M., Cheng, P. E., & Johnson, E. G. (1997). Standard errors of the Kernel equating methods under the common-item design. Applied Psychological Measurement, 21(4), 349-369. doi: 10.1177/01466216970214005
Liu, J., & Low, A. C. (2007). An Exploration of Kernel Equating Using SAT® Data: Equating to a Similar Population and to a Distant Population. (ETS RR‐07‐17). Princeton, NJ: Educational Testing Service.
MEB (2016). TIMSS 2015 Uluslararası Matematik ve Fen Eğilimleri Araştırması: TIMSS 2015 Ulusal Matematik ve Fen Bilimleri Ön Raporu 4. ve 8. Sınıflar. Ankara: MEB Ölçme, Değerlendirme ve Sınav Hizmetleri Genel Müdürlüğü. Retrieved March 15, 2018, from http://timss.meb.gov.tr/wp-content/uploads/TIMSS_2015_Ulusal_Rapor.pdf
Meng, Y. (2012). Comparison of Kernel Equating and Item Response Theory Equating Methods (Unpublished doctoral dissertation). University of Massachusetts Amherst, United States.
Mao, X. (2006). An investigation of the accuracy of the estimates of standard errors for the Kernel equating functions (Unpublished doctoral dissertation). University of Iowa, Iowa City, United States.
Mao, X., von Davier, A. A., & Rupp, S. (2006). Comparisons of the Kernel equating method with the traditional equating methods on PRAXIS™ data (ETS RR-06-30). Princeton, NJ: Educational Testing Service.
Moses, T., & Holland, P. (2007). Kernel and traditional equipercentile equating with degrees of presmoothing (ETS RR-07-15). Princeton, NJ: Educational Testing Service.
Mullis, I. V. S., Cotter, K. E., Centurino, V. A. S., Fishbein, B. G., & Liu, J. (2016). Using Scale Anchoring to Interpret the TIMSS 2015 Achievement Scales. In M. O. Martin, I. V. S. Mullis, & M. Hooper (Eds.), Methods and Procedures in TIMSS 2015 (pp. 14.1-14.47). Retrieved from Boston College, TIMSS & PIRLS International Study Center website: http://timss.bc.edu/publications/timss/2015-methods/chapter-14.html
Norman Dvorak, R. K. (2009). A comparison of Kernel equating to the test characteristic curve methods (Unpublished doctoral dissertation). University of Nebraska, Lincoln, United States.
R Core Team. (2017). R: A language and environment for statistical computing [Computer software manual]. Vienna, Austria.
Ricker, K., & von Davier, A. A. (2007). The impact of anchor test length on equating results in a nonequivalent groups design (ETS RR-07-44). Princeton, NJ: Educational Testing Service.
von Davier, A., Holland, P. W., & Thayer, D. T. (2004). The Kernel method of equating. New York, NY: Springer.
von Davier, A. A., Holland, P. W., Livingston, S. A., Casabianca, J., Grant, M. C., & Martin, K. (2006). An evaluation of the Kernel equating method. A special study with pseudotests constructed from real test data (ETS RR-06-02). Princeton, NJ: Educational Testing Service.

Kernel Eşitleme Yöntemlerinin Denk Olmayan Gruplarda Ortak Madde Test Deseninde Karşılaştırılması [A Comparison of Kernel Equating Methods under the Non-Equivalent Groups Anchor Test Design]

Abstract

Problem Statement: Equating can be defined as a statistical process that adjusts for differences between test forms developed at similar levels of content and difficulty so that the scores obtained from these forms can be used interchangeably (Kolen, 1988). Test equating methods have attracted the attention of psychometricians for about 100 years, and new methods continue to be developed. Equating methods include those based on equipercentile equating, linear equating methods, IRT observed-score and true-score equating, van der Linden's local equating, the Levine nonlinear method, and the more recent approach of Kernel equating (von Davier, 2013). Kernel equating can be used with the single-group, equivalent-groups, and non-equivalent groups anchor test designs (von Davier et al., 2004). The Non-Equivalent groups Anchor Test (NEAT) design is used when a test is administered more than once for reasons of test security. In the NEAT design, both forms contain common items, and the equating relationship between the forms is established through these common items (Kolen & Brennan, 2014). Kernel equating covers both linear and equipercentile equating methods. The methods available under the NEAT design are chained equating (linear and equipercentile), post-stratification equating (equipercentile and linear), and Levine observed-score linear equating. Several studies have compared Kernel equating methods with traditional equating methods and with Item Response Theory equating methods. The aim of this study is to equate the TIMSS science data, which include Turkey, using Kernel equating methods.

Purpose of the Study: The purpose of this study is to equate Booklets 1 and 14 of the TIMSS science data using the chained and post-stratification Kernel equating methods and to determine which equating method performs best.

Method: At the time TIMSS 2015 was conducted, there were 1,108,572 fourth-grade students and 1,187,893 eighth-grade students in Turkey. Of these, 6,456 fourth-grade students and 6,079 eighth-grade students took part in the TIMSS administration. The study sample consisted of the 865 eighth-grade students who received Booklets 1 and 14 during the TIMSS administration in Turkey. The analysis used the data set of response patterns given by eighth-grade students in Turkey to the science items in TIMSS 2015. Of the 14 booklets in the TIMSS administration, the items in Booklets 1 and 14 were used. Booklet 1 contains 39 items and Booklet 14 contains 38 items. To prepare the data for analysis, incorrect and missing responses were coded as 0, and partial-credit and correct responses were all coded as 1. In the first stage of the analysis, the booklets were equated using the Kernel chained and Kernel post-stratification equipercentile and linear equating methods. The equating methods were then evaluated against the DTM, PRE, SEE, SEED, and RMSD criteria.
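The scoring rule described above (incorrect and missing responses scored 0, partial-credit and correct responses scored 1) can be sketched as follows. The raw coding used here is an assumption made for illustration: `None` stands for a missing response, 0 for an incorrect one, and any positive value for partial or full credit. The function names `dichotomize` and `total_score` are hypothetical, and the actual TIMSS data files use their own response codes.

```python
def dichotomize(response):
    """Recode one raw item response to 0 or 1.

    Assumed raw coding (hypothetical): None = missing, 0 = incorrect,
    positive values = partial or full credit.
    """
    if response is None or response == 0:
        return 0
    return 1


def total_score(responses):
    """Sum of dichotomized item scores for one examinee."""
    return sum(dichotomize(r) for r in responses)


# Hypothetical response vector for one examinee on five items.
examinee = [None, 0, 1, 2, 1]
print(total_score(examinee))
```

Collapsing partial credit into full credit keeps every item dichotomous, so the maximum total score equals the number of items in the booklet.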

Findings: After the booklets were equated with the Kernel chained equipercentile, chained linear, post-stratification linear, and post-stratification equipercentile methods, PRE values were obtained first. The data fit the KE chained equipercentile and post-stratification equipercentile methods better. When the equating methods were compared, the equipercentile and linear methods produced similar results, and the differences between them were smaller than the DTM. The SEE values of the methods were close to one another in the middle of the score scale. At the extreme scores, the Kernel equipercentile methods had low standard errors, whereas the linear methods had high standard errors. When SEED values were compared, the differences between the equating methods were smaller than the DTM and stayed within the ±2 SEED band. The RMSD coefficient was calculated to assess the random error in the equating methods. The post-stratification equating method contained the least random error, and the chained linear equating method contained the most.

Conclusions and Recommendations: When the Kernel equating methods were compared in terms of mean SEE, the linear methods had a higher mean SEE than the equipercentile methods. This finding is not consistent with the findings of Choi (2009) and Liou et al. (1997). The discrepancy may stem from the use of simulated data or large sample sizes in those studies. In addition, the KE linear methods yielded higher standard errors at the extreme scores than at the middle scores, which supports earlier studies in the literature. When the RMSD coefficients were compared, the post-stratification equating method contained the least random error and the chained linear equating method the most. Based on these results, future studies could apply different equating methods with different criteria and compare their outcomes with the results of this study.

Keywords

equipercentile, equating, linear, SEED, SEE

There are 24 citations in total.

Details

Primary Language	English
Journal Section	Articles
Authors	Cigdem Akın Arıkan
Publication Date	July 31, 2019
Published in Issue	Year 2019 Volume: 19 Issue: 82

Cite

APA	Akın Arıkan, C. (2019). A Comparison of Kernel Equating Methods Based on Neat Design. Eurasian Journal of Educational Research, 19(82), 27-44.
AMA	Akın Arıkan C. A Comparison of Kernel Equating Methods Based on Neat Design. Eurasian Journal of Educational Research. July 2019;19(82):27-44.
Chicago	Akın Arıkan, Cigdem. “A Comparison of Kernel Equating Methods Based on Neat Design”. Eurasian Journal of Educational Research 19, no. 82 (July 2019): 27-44.
EndNote	Akın Arıkan C (July 1, 2019) A Comparison of Kernel Equating Methods Based on Neat Design. Eurasian Journal of Educational Research 19 82 27–44.
IEEE	C. Akın Arıkan, “A Comparison of Kernel Equating Methods Based on Neat Design”, Eurasian Journal of Educational Research, vol. 19, no. 82, pp. 27–44, 2019.
ISNAD	Akın Arıkan, Cigdem. “A Comparison of Kernel Equating Methods Based on Neat Design”. Eurasian Journal of Educational Research 19/82 (July2019), 27-44.
JAMA	Akın Arıkan C. A Comparison of Kernel Equating Methods Based on Neat Design. Eurasian Journal of Educational Research. 2019;19:27–44.
MLA	Akın Arıkan, Cigdem. “A Comparison of Kernel Equating Methods Based on Neat Design”. Eurasian Journal of Educational Research, vol. 19, no. 82, 2019, pp. 27-44.
Vancouver	Akın Arıkan C. A Comparison of Kernel Equating Methods Based on Neat Design. Eurasian Journal of Educational Research. 2019;19(82):27-44.
