PISA 2012 Matematik Okuryazarlığı Testinde Farklı Ölçek Dönüştürme Yöntemlerinin Karşılaştırılması

Şeyma Uyar; Burcu Aksekioğlu; Neşe Öztürk Gübeş

doi:10.21764/maeuefd.330613

Research Article

Year 2018, Issue: 46, 121-148, 19.04.2018

https://doi.org/10.21764/maeuefd.330613

Abstract

This study aimed to compare different scale linking methods on PISA 2012 mathematics literacy data. To this end, scores obtained from two selected booklets were equated using item response theory (IRT) based scale linking methods (mean-mean, mean-sigma, Stocking-Lord, Haebara) and test equating methods (IRT true-score equating, IRT observed-score equating), and the results from the different methods were examined. The study was carried out using the responses given to the mathematics tests in booklets 4 and 11. The study group therefore consists of 716 students from the Turkey sample: 348 who answered booklet 4 and 368 who answered booklet 11. The “common-item nonequivalent groups” design was used for test equating. In the first stage of data analysis, the unidimensionality assumption of item response theory was tested. Item and ability parameters were then estimated with PARSCALE 4.1, using the two-parameter logistic model and the generalized partial credit model. Next, scale linking was performed with the STUIRT program using the four methods. In the final stage, the test scores obtained from the two forms were equated with the POLYEQUATE program. The amount of error associated with each method was calculated with the weighted mean squared error (WMSE). The Stocking-Lord method yielded the smallest error in true-score equating, and the Haebara method yielded the smallest error in observed-score equating. The mean-sigma method gave the largest equating error.

Keywords

test equating, mixed-format test, scale linking methods, equating error
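
The two moment-based methods named in the abstract (mean-mean and mean-sigma) can be illustrated with a short Python sketch. It assumes the usual linear scale transformation, in which an ability on the new form's scale is mapped to the old form's scale as A·θ + B, a discrimination a becomes a / A, and a difficulty b becomes A·b + B. The function names and the three common-item parameter values are hypothetical and are not taken from the study, which carried out this step with STUIRT.

```python
from statistics import mean, pstdev

def mean_sigma(b_new, b_old):
    """Mean-sigma coefficients from common-item difficulties."""
    A = pstdev(b_old) / pstdev(b_new)
    B = mean(b_old) - A * mean(b_new)
    return A, B

def mean_mean(a_new, a_old, b_new, b_old):
    """Mean-mean coefficients from common-item discriminations and difficulties."""
    A = mean(a_new) / mean(a_old)
    B = mean(b_old) - A * mean(b_new)
    return A, B

def transform_item(a, b, A, B):
    """Place one item's (a, b) parameters on the old form's scale."""
    return a / A, A * b + B

# Hypothetical common-item estimates (not data from the study).
# The "new" values were constructed from the "old" ones with A = 2, B = 0.5.
a_old = [1.0, 1.2, 0.8]
b_old = [-0.5, 0.0, 1.0]
a_new = [2.0, 2.4, 1.6]
b_new = [-0.5, -0.25, 0.25]

print("mean-sigma:", mean_sigma(b_new, b_old))
print("mean-mean: ", mean_mean(a_new, a_old, b_new, b_old))
```

Because the illustrative parameters are related by an exact linear transformation, both methods recover the same coefficients here. With real estimates the two methods generally differ, since they use different moments of the common-item parameters.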

References

Angoff, W. H. (1984). Scales, norms and equivalent scores. Princeton, New Jersey: Educational Testing Service.
Baker, F. B. & Al-Karni, A. (1991). A comparison of two procedures for computing IRT equating coefficients. Journal of Educational Measurement, 28(2), 147-162.
Büyüköztürk, Ş., Çokluk, Ö. & Köklü, N. (2013). Sosyal bilimler için istatistik (12th ed.). Ankara: Pegem Akademi.
Cohen, A. S. & Kim, S. H. (1998). An investigation of linking methods under the graded response model. Applied Psychological Measurement, 22(2), 116-130.
Cook, L. & Eignor, D. R. (1991). NCME instructional module: IRT equating methods. Educational Measurement: Issues and Practice, 10(3), 37-45.
Crocker, L. & Algina, J. (1986). Introduction to classical and modern test theory. USA: Harcourt Brace Jovanovich College.
Çokluk, Ö., Şekercioğlu, G. & Büyüköztürk, Ş. (2014). Sosyal bilimler için çok değişkenli istatistik: SPSS ve LISREL uygulamaları (3rd ed.). Ankara: Pegem Yayıncılık.
De Ayala, R. J. (2009). The theory and practice of item response theory. New York: The Guilford Press.
Embretson, S. E. & Reise, S. P. (2000). Item response theory for psychologists. London: Lawrence Erlbaum Associates Publishers.
Felan, G. D. (2002, February). Test equating: mean, linear, equipercentile and item response theory. Paper presented at the Annual Meeting of the Southwest Educational Research Association, Austin, Texas.
French, D. J. (1996). The utility of Stocking-Lord’s equating procedure for equating norm-referenced and criterion-referenced tests with both dichotomous and polytomous components. Unpublished doctoral dissertation, University of Texas, Texas.
Gök, B. (2012). Denk olmayan gruplarda ortak madde deseni kullanılarak madde tepki kuramına dayalı eşitleme yöntemlerinin karşılaştırılması. Unpublished doctoral dissertation, Hacettepe Üniversitesi, Ankara.
Gültekin, S. (2014). Testlerde kullanılacak madde türleri, hazırlama ilkeleri ve puanlaması. In N. Demirtaşlı (Ed.), Eğitimde ölçme ve değerlendirme (2nd ed.). Ankara: Edge Akademi.
Haebara, T. (1980). Equating logistic ability scales by a weighted least squares method. Japanese Psychological Research, 22(3), 144-149.
Hagge, S. L. (2010). The impact of equating method and format representation of common items on the adequacy of mixed format test equating using nonequivalent groups. Unpublished doctoral dissertation, University of Iowa, Iowa City.
Hambleton, R. K. (1989). Item response theory: introduction and bibliography (Report No. 196). Amherst: University of Massachusetts.
Hambleton, R. K. & Swaminathan, H. (1985). Item response theory: principles and applications. Boston: Kluwer-Nijhoff Publishing.
Han, T., Kolen, M. & Pohlmann, J. (1997). A comparison among IRT true and observed-score equatings and traditional equipercentile equating. Applied Measurement in Education, 10(2), 105-121, DOI: 10.1207/s15324818ame1002_1.
Hanson, B. A. & Beguin, A. A. (2002). Obtaining a common scale for item response theory item parameters using separate versus concurrent estimation in the common-item equating design. Applied Psychological Measurement, 26(1), 3-24.
Harris, D. J. & Crouse, J. D. (1993). A study of criteria used in equating. Applied Measurement in Education, 6(3), 195-240.
Holland, P. W., Dorans, N. J. & Petersen, N. S. (2007). Equating test scores. In C. R. Rao, S. Sinharay (Eds.), Handbook of statistics: Psychometrics (pp. 169-197). Amsterdam: Elsevier B. V.
Jöreskog, K. G. & Sörbom, D. (1986). LISREL 8.7: PRELIS, a program for multivariate data screening and data summarization [Computer software]. Mooresville, IN: Scientific Software Inc.
Kilmen, S. (2010). Madde tepki kuramına dayalı test eşitleme yöntemlerinden kestirilen eşitleme hatalarının örneklem büyüklüğü ve yetenek dağılımına göre karşılaştırılması. Unpublished doctoral dissertation, Ankara Üniversitesi, Ankara.
Kim, H. K. (2006). The effect of repeaters on equating: A population invariance approach. Unpublished doctoral dissertation, The University of Iowa, Iowa City.
Kim, S. & Cohen, A. S. (2002). A comparison of linking and concurrent calibration under the graded response model. Applied Psychological Measurement, 26(1), 25-41.
Kim, S. & Kolen, M. J. (2004). STUIRT: A computer program for scale transformation under unidimensional item response theory models [Computer software]. Iowa City, IA: The Center for Advanced Studies in Measurement and Assessment (CASMA), The University of Iowa.
Kim, S. & Kolen, M. J. (2006). Robustness to format effects of IRT linking methods for mixed-format tests. Applied Measurement in Education, 19(4), 357-381.
Kim, S. & Kolen, M. J. (2007). Effects on scale linking of different definitions of criterion functions for the IRT characteristic curve methods. Journal of Educational and Behavioral Statistics, 32(4), 371-397.
Kim, S. & Lee, W. (2004). IRT scale linking methods for mixed-format tests (ACT Research Report 2004-5). Iowa City, IA: ACT, Inc.
Kim, S. & Lee, W. (2006). IRT scale linking methods for mixed-format tests (ACT Research Report 2004-5). Iowa City, IA: ACT, Inc.
Kolen, M. J. (1981). Comparison of traditional and item response theory methods for equating tests. Journal of Educational Measurement, 18(1), 1-11.
Kolen, M. J. (1988). An NCME instructional module on traditional equating methodology. Educational Measurement: Issues and Practice, 7(4), 29-36.
Kolen, M. J. (2004). POLYEQUATE windows console version [Computer software]. Iowa City, IA: The Center for Advanced Studies in Measurement and Assessment (CASMA), The University of Iowa.
Kolen, M. J. & Brennan, R. L. (1995). Test equating: Methods and practices. New York: Springer.
Kolen, M. J. & Brennan, R. L. (2004). Test equating, scaling and linking (2nd ed.). New York: Springer.
Kolen, M. J. & Brennan, R. L. (2014). Test equating, scaling and linking: Methods and practices (3rd ed.). New York: Springer.
Kubiszyn, T. & Borich, G. D. (2013). Educational testing and measurement: Classroom application and practice (10th ed.). New Jersey: Wiley.
Lee, W. & Ban, J. (2010). A comparison of IRT linking procedures. Applied Measurement in Education, 23(1), 23-48.
Li, Y. H., Lissitz, R. W. & Yang, Y. N. (1999, April). Estimating IRT equating coefficients for tests with polytomously and dichotomously scored items. Paper presented at the Annual Meeting of the National Council on Measurement in Education, Montreal, Canada.
Lord, F. M. & Wingersky, M. S. (1984). Comparison of IRT true-score and equipercentile observed score equatings. Applied Psychological Measurement, 8, 452–461.
Lorenzo-Seva, U. & Ferrando, P. J. (2006). FACTOR 10.4 [Computer software]. Tarragona: Universitat Rovira i Virgili.
Loyd, B. H. & Hoover, H. D. (1980). Vertical equating using the Rasch model. Journal of Educational Measurement, 17(3), 179-193.
Marco, G. L. (1977). Item characteristic curve solutions to three intractable testing problems. Journal of Educational Measurement, 14(2), 139-160.
MEB (2013). PISA 2012 ulusal ön raporu. Ankara: Sebit.
Muraki, E. & Bock, R. D. (2003). PARSCALE 4.1 [Computer software]. Chicago, IL: Scientific Software International, Inc.
OECD (2009). PISA Data Analysis Manual: SPSS (Second Edition). PISA, OECD Publishing, DOI: 10.1787/9789264056275-en.
Ogasawara, H. (2000). Asymptotic standard errors of IRT equating coefficients using moments. Economic Review (Otaru University of Commerce), 51(1), 1-23.
Ogasawara, H. (2001). Standard errors of item response theory equating/linking by response function methods. Applied Psychological Measurement, 25(1), 53-67.
Öztürk-Gübeş, N. & Kelecioğlu, H. (2016). The impact of test dimensionality, common-item set format, and scale linking methods on mixed-format test equating. Educational Sciences: Theory and Practice, 16, 715-734.
Öztürk-Gübeş, N. & Kelecioğlu, H. (2015). Karma testlerin eşitlenmesinde MTK eşitleme yöntemlerinin eşitlik özelliği ölçütüne göre karşılaştırılması. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 6(1), 117-125.
Petersen, N. S., Kolen, M. J. & Hoover, H. D. (1989). Scaling, norming and equating. In R. L. Linn (Ed.), Educational measurement (pp. 221-262). New York: Macmillan.
Sinharay, S. & Holland, P. W. (2010). A new approach to comparing several equating methods in the context of the NEAT design. Journal of Educational Measurement, 47(3), 261-285.
Skaggs, G. & Lissitz, R. (1982, March). Test equating: relevant issues and a review of recent research. Paper presented at the Annual Meeting of the American Educational Research Association, Los Angeles, California.
Speron, E. (2009). A comparison of metric linking procedures in item response theory. Unpublished doctoral dissertation, Illinois Institute of Technology, Chicago.
Stocking, M. L. & Lord, F. M. (1983). Developing a common metric in item response theory. Applied Psychological Measurement, 7(2), 201-210.
Sönmez, V. & Alacapınar, F. G. (2016). Örneklendirilmiş bilimsel araştırma yöntemleri (4th ed.). Ankara: Anı Yayıncılık.
Tanguma, J. (2000, January). Equating test scores using the linear method: a primer. Paper presented at the Annual Meeting of the Southwest Educational Research Association, Dallas, Texas.
Tate, R. (2000). Performance of a proposed method for the linking of mixed-format tests with constructed response and multiple choice items. Journal of Educational Measurement, 37(4), 329-346.
Tsai, T., Hanson, B. A., Kolen, M. J. & Forsyth, R. A. (2001). A comparison of bootstrap standard errors of IRT equating methods for the common-item nonequivalent groups design. Applied Measurement in Education, 14(1), 17-30, DOI: 10.1207/S15324818AME1401_03.
Uysal, İ. (2014). Madde tepki kuramına dayalı test eşitleme yöntemlerinin karma modeller üzerinde karşılaştırılması. Unpublished master's thesis, Abant İzzet Baysal Üniversitesi, Bolu.
Yang, W. L. & Houang, R. T. (1996, April). The effect of anchor length and equating method on the accuracy of test equating comparisons of linear and IRT-based equating using an anchor-item design. Paper presented at American Educational Research Association, New York, USA.
Zhu, W. (1998). Test equating: What, why and how? Research Quarterly for Exercise and Sport, 69(1), 11–23.

Comparison of Different Scale Linking Methods in PISA 2012 Mathematics Literacy Test

Abstract

This study compared different scale linking methods using PISA 2012 mathematics literacy data. Scores obtained from two selected booklets were equated using scale linking methods (mean-mean, mean-sigma, Stocking-Lord, Haebara) and test equating methods (IRT true-score equating, IRT observed-score equating) based on item response theory, and the results obtained from the different methods were examined. The study used the answers given to the mathematics tests in booklet 4 and booklet 11. The sample therefore consists of 716 students in Turkey, of whom 348 took booklet 4 and 368 took booklet 11. The “common-item nonequivalent groups” design was used to equate the test forms. In the first stage of data analysis, the unidimensionality assumption of item response theory was tested. PARSCALE 4.1 was then used to estimate item and ability parameters under the two-parameter logistic model and the generalized partial credit model. Afterwards, the STUIRT program was used to perform scale linking with the four methods. In the last step, test scores obtained from the two forms were equated using the POLYEQUATE program. The equating error of each method was calculated with the weighted mean squared error (WMSE) index. The results showed that the Stocking-Lord method had the smallest equating error in true-score equating and the Haebara method had the smallest equating error in observed-score equating. The mean-sigma method produced the largest equating error.

Keywords

test equating, mixed-format test, scale linking methods, equating error
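
The two characteristic curve methods named in the abstract differ in what they compare: the Haebara criterion sums squared differences between item characteristic curves, while the Stocking-Lord criterion squares the difference between test characteristic curves. The following Python sketch illustrates both under simplifying assumptions: only dichotomous two-parameter logistic items, equally weighted ability points, a one-directional criterion, and a coarse grid search in place of a numerical optimizer. All names and parameter values are hypothetical. The study itself used STUIRT and also included generalized partial credit items, which this sketch does not cover.

```python
import math

def p2pl(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def rescale(items, A, B):
    """Move (a, b) pairs from the new form's scale to the old form's scale."""
    return [(a / A, A * b + B) for a, b in items]

def haebara_loss(A, B, new_items, old_items, thetas):
    """Sum of squared item characteristic curve differences."""
    moved = rescale(new_items, A, B)
    return sum(
        (p2pl(t, ao, bo) - p2pl(t, an, bn)) ** 2
        for t in thetas
        for (ao, bo), (an, bn) in zip(old_items, moved)
    )

def stocking_lord_loss(A, B, new_items, old_items, thetas):
    """Sum of squared test characteristic curve differences."""
    moved = rescale(new_items, A, B)
    return sum(
        (sum(p2pl(t, a, b) for a, b in old_items)
         - sum(p2pl(t, a, b) for a, b in moved)) ** 2
        for t in thetas
    )

def grid_search(loss, new_items, old_items, thetas):
    """Coarse search over A in [0.5, 3.0] and B in [-2.0, 2.0], step 0.1."""
    candidates = [(i / 10, j / 10) for i in range(5, 31) for j in range(-20, 21)]
    return min(candidates,
               key=lambda ab: loss(ab[0], ab[1], new_items, old_items, thetas))

# Hypothetical common items as (a, b) pairs; the new-form values were
# constructed from the old-form values with A = 2, B = 0.5.
old_items = [(1.0, -0.5), (1.2, 0.0), (0.8, 1.0)]
new_items = [(2.0, -0.5), (2.4, -0.25), (1.6, 0.25)]
thetas = [k / 5 for k in range(-20, 21)]  # ability points from -4 to 4

print("Stocking-Lord:", grid_search(stocking_lord_loss, new_items, old_items, thetas))
print("Haebara:      ", grid_search(haebara_loss, new_items, old_items, thetas))
```

On noise-free illustrative parameters both criteria are minimized by the same coefficients. With estimated parameters they can lead to different linking coefficients, which is one reason equating error can vary by method.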

There are 60 references in total.

Details

Primary Language	Turkish
Section	Articles
Authors	Şeyma Uyar, Burcu Aksekioğlu, Neşe Öztürk Gübeş
Publication Date	April 19, 2018
Submission Date	July 24, 2017
Published in Issue	Year 2018, Issue: 46

Cite

APA	Uyar, Ş., Aksekioğlu, B., & Öztürk Gübeş, N. (2018). PISA 2012 Matematik Okuryazarlığı Testinde Farklı Ölçek Dönüştürme Yöntemlerinin Karşılaştırılması. Mehmet Akif Ersoy Üniversitesi Eğitim Fakültesi Dergisi, (46), 121-148. https://doi.org/10.21764/maeuefd.330613

Mehmet Akif Ersoy Üniversitesi Eğitim Fakültesi Dergisi