Comparison of Different Scale Linking Methods in PISA 2012 Mathematics Literacy Test

Şeyma Uyar; Burcu Aksekioğlu; Neşe Öztürk Gübeş

doi:10.21764/maeuefd.330613

Abstract

This study compared different scale linking methods using PISA 2012 mathematics literacy data. Scores from two selected booklets were equated using item response theory (IRT) scale linking methods (mean-mean, mean-sigma, Stocking-Lord, Haebara) and test equating methods (IRT true-score equating, IRT observed-score equating), and the results of the different methods were examined. The study used the responses to the mathematics tests in booklets 4 and 11. The study group therefore consisted of 716 students from the Turkey sample: 348 who took booklet 4 and 368 who took booklet 11. The common-item nonequivalent groups design was used for test equating. In the first stage of data analysis, the unidimensionality assumption of IRT was tested. Item and ability parameters were then estimated with PARSCALE 4.1, using the two-parameter logistic model and the generalized partial credit model. Next, scale linking was carried out with the STUIRT program using the four methods. In the final stage, the test scores from the two forms were equated with the POLYEQUATE program. The equating error of each method was computed with the weighted mean squared error (WMSE) index. The results showed that the Stocking-Lord method had the smallest equating error in true-score equating and the Haebara method had the smallest equating error in observed-score equating. The mean-sigma method produced the largest equating error.

Keywords

test equating, mixed-format test, scale linking methods, equating error
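
The four scale linking methods named in the abstract can be illustrated with a minimal Python sketch. It is not the STUIRT implementation used in the study. It assumes dichotomous two-parameter logistic items only (the study also fitted the generalized partial credit model), a logistic metric with D = 1, an equally weighted grid of ability points, and a one-directional criterion. All parameter values and function names (`mean_sigma`, `mean_mean`, `char_curve_loss`) are hypothetical and chosen so that the true linking coefficients are A = 2 and B = 0.5.

```python
import math
from statistics import mean, pstdev


def mean_sigma(b_new, b_old):
    """Mean-sigma coefficients from common-item difficulties."""
    A = pstdev(b_old) / pstdev(b_new)
    B = mean(b_old) - A * mean(b_new)
    return A, B


def mean_mean(a_new, a_old, b_new, b_old):
    """Mean-mean coefficients from common-item discriminations and difficulties."""
    A = mean(a_new) / mean(a_old)
    B = mean(b_old) - A * mean(b_new)
    return A, B


def p2pl(theta, a, b):
    """2PL probability of a correct response (D = 1 assumed)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def char_curve_loss(A, B, a_new, b_new, a_old, b_old, thetas, method="stocking-lord"):
    """Characteristic-curve criterion for candidate coefficients (A, B).

    New-form parameters are rescaled as a / A and A * b + B before comparison.
    Haebara sums squared differences between item characteristic curves;
    Stocking-Lord squares the difference between test characteristic curves.
    """
    total = 0.0
    for t in thetas:
        diffs = [
            p2pl(t, ao, bo) - p2pl(t, an / A, A * bn + B)
            for an, bn, ao, bo in zip(a_new, b_new, a_old, b_old)
        ]
        if method == "haebara":
            total += sum(d * d for d in diffs)
        else:
            total += sum(diffs) ** 2
    return total


# Hypothetical common-item parameters: old scale = 2 * new scale + 0.5.
a_new, b_new = [1.0, 2.0, 1.5], [-1.0, 0.0, 1.0]
a_old, b_old = [0.5, 1.0, 0.75], [-1.5, 0.5, 2.5]
thetas = [-4 + 0.5 * k for k in range(17)]  # ability grid from -4 to 4

A_ms, B_ms = mean_sigma(b_new, b_old)
A_mm, B_mm = mean_mean(a_new, a_old, b_new, b_old)
loss_sl = char_curve_loss(A_ms, B_ms, a_new, b_new, a_old, b_old, thetas)
loss_hb = char_curve_loss(A_ms, B_ms, a_new, b_new, a_old, b_old, thetas, "haebara")
print(A_ms, B_ms, A_mm, B_mm, loss_sl, loss_hb)
```

In practice the Stocking-Lord and Haebara coefficients are the values of A and B that minimize the corresponding criterion, which requires a numerical optimizer; the sketch only evaluates the criterion at given coefficients. With noise-free parameters such as these, all four methods agree, so differences between methods arise only with estimated parameters.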

Details

Primary Language

Turkish

Section

Research Article

Authors

Şeyma Uyar
MEHMET AKİF ERSOY ÜNİVERSİTESİ
Türkiye

Burcu Aksekioğlu
MEHMET AKİF ERSOY ÜNİVERSİTESİ
Türkiye

Neşe Öztürk Gübeş
MEHMET AKİF ERSOY ÜNİVERSİTESİ
Türkiye

Publication Date

19 April 2018

Submission Date

24 July 2017

Acceptance Date

12 March 2018

Published in Issue

Year 2018, Issue 46

DOI

https://doi.org/10.21764/maeuefd.330613

IZ

https://izlik.org/JA62CR78XD

Cite As

APA

Uyar, Ş., Aksekioğlu, B., & Öztürk Gübeş, N. (2018). PISA 2012 Matematik Okuryazarlığı Testinde Farklı Ölçek Dönüştürme Yöntemlerinin Karşılaştırılması. Mehmet Akif Ersoy University Journal of Education Faculty, 46, 121-148. https://doi.org/10.21764/maeuefd.330613

Cited By

The effect of scale linking methods, common-item ratio, and item discrimination on equating error in test equating based on item response theory

Mehmet Akif Ersoy Üniversitesi Eğitim Fakültesi Dergisi

https://doi.org/10.21764/maeuefd.1366213