A Statistical Comparison of Norm-Referenced Assessment Systems Used in Higher Education in Turkey

Erkan Hasan Atalmış

doi:10.21031/epod.487335

Research Article

Türkiye’de Yükseköğretimde Kullanılan Bağıl Değerlendirme Sistemlerinin İstatistiksel Olarak Karşılaştırılması

Year 2019, Volume: 10 Issue: 1, 12 - 29, 29.03.2019

Cited By: 1

Abstract

The aim of this study is to identify the different norm-referenced assessment
systems used in higher education in Turkey and to compare these systems
empirically. To this end, the norm-referenced assessment regulations of
approximately 70 state universities in Turkey that use a norm-referenced
assessment system were first examined, and the universities were classified
into four groups according to the system they use: applying only T-score
conversion (the most common method); applying T-score conversion together with
percentiles; applying T-score conversion, percentiles, and standard deviation
together; and applying only a standard-deviation-based norm-referenced
assessment system. After the algorithms of two universities applying T-score
conversion and of three universities using other norm-referenced assessment
systems were selected, these algorithms were used to convert the end-of-term
raw achievement scores (ham başarı puanı, HBP) in each course of 19,574
students at a state university into letter grades and grades on the 4-point
scale. To test the differences among the norm-referenced assessment systems
used at the selected universities, the systems were compared both with the
criterion-referenced (absolute) assessment system currently in use and with
one another. A paired-samples test of differences was applied to test the
difference between absolute and norm-referenced assessment, while the
difference among the norm-referenced assessment systems was examined with
one-way analysis of variance. According to the results, letter grades
calculated with norm-referenced assessment were statistically significantly
higher than those calculated with absolute assessment, and the letter grades
obtained using the universities' norm-referenced assessment systems differed
from one another to a statistically significant degree. At the end of the
study, the consequences of both the difference between absolute and
norm-referenced grading and the differences among the norm-referenced
assessment systems are discussed from the perspective of students and
instructors.
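
The T-score conversion named in the abstract can be illustrated with a minimal sketch. The standard definition T = 50 + 10 * (x - mean) / sd is used; the cutoff table, the letter labels, and the function names below are hypothetical and are not taken from any university's regulation or from the article itself.

```python
from statistics import mean, pstdev

# Hypothetical cutoffs on the T scale (assumption for illustration only);
# each university's regulation defines its own boundaries.
T_CUTOFFS = [
    (62.0, "AA", 4.0), (57.0, "BA", 3.5), (52.0, "BB", 3.0),
    (47.0, "CB", 2.5), (42.0, "CC", 2.0), (37.0, "DC", 1.5),
    (32.0, "DD", 1.0),
]


def t_scores(raw_scores):
    """Convert raw course scores to T-scores: T = 50 + 10 * (x - mean) / sd."""
    mu = mean(raw_scores)
    sd = pstdev(raw_scores)  # population SD of the course group
    if sd == 0:
        # Every student has the same raw score, so every T-score is 50.
        return [50.0 for _ in raw_scores]
    return [50.0 + 10.0 * (x - mu) / sd for x in raw_scores]


def letter_grade(t):
    """Map one T-score to a (letter, 4-point value) pair via the cutoff table."""
    for cutoff, letter, points in T_CUTOFFS:
        if t >= cutoff:
            return letter, points
    return "FF", 0.0


def grade_course(raw_scores):
    """Grade every student in one course relative to that course's distribution."""
    return [letter_grade(t) for t in t_scores(raw_scores)]
```

Because the conversion depends on the course mean and standard deviation, the same raw score can map to different letter grades in different courses, which is the property that distinguishes norm-referenced from criterion-referenced grading.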

Keywords

norm-referenced assessment, criterion-referenced assessment, assessment in higher education, grading system


Abstract

The purpose of this study is to identify the different
norm-referenced assessment systems used in Turkish higher education and to
compare them empirically. The norm-referenced assessment regulations of
approximately 70 state universities in Turkey were first analyzed, and the
universities were divided into four groups according to their norm-referenced
assessment systems (applying only T-score conversion, the most commonly used
method; applying T-score conversion and percentiles together; applying T-score
conversion, percentiles, and standard deviation together; applying only a
standard-deviation-based norm-referenced assessment system). After the
algorithms of two universities applying T-score conversion and three
universities applying other norm-referenced assessment systems were selected,
they were used to convert the end-of-term raw scores for each course of 19,574
students at a state university into letter grades and grades on the 4-point
scale. To test the differences among the norm-referenced assessment systems
used in these universities, each norm-referenced system was compared with the
criterion-referenced system currently in use as well as with the
norm-referenced systems of the other universities. The paired t-test was used
to identify the difference between norm-referenced and criterion-referenced
assessment, while the differences among the norm-referenced assessment systems
were analyzed through one-way analysis of variance. The findings revealed that
the letter grades calculated through norm-referenced assessment were
statistically significantly higher than those calculated through
criterion-referenced assessment; in addition, a statistically significant
difference was identified among the letter grades obtained using the
norm-referenced assessment systems of the universities. At the end of the
study, the findings were discussed in terms of students and instructors.
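
The two tests named in the abstract can be sketched with the Python standard library. This is a minimal illustration of the test statistics only, not the article's analysis: the function names and the sample data in the usage note are made up, and p-values are omitted (in practice they could be obtained from a statistics package such as SciPy).

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched score lists.

    Assumes the paired differences are not all identical (stdev > 0).
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def one_way_anova_f(*groups):
    """F statistic with between- and within-group degrees of freedom.

    Assumes at least two groups and nonzero within-group variation.
    """
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n = len(all_values)
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, k - 1, n - k
```

For example, `paired_t(norm_points, criterion_points)` would compare the 4-point grades the same students receive under the two approaches, and `one_way_anova_f(system_a, system_b, system_c)` would compare the grades produced by several norm-referenced systems; these variable names are placeholders.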

Keywords

Norm-referenced assessment, criterion-referenced assessment, assessment in higher education, grading system

Details

Primary Language	English
Journal Section	Articles
Authors	Erkan Hasan Atalmış 0000-0001-9610-491X
Publication Date	March 29, 2019
Acceptance Date	January 28, 2019
Published in Issue	Year 2019 Volume: 10 Issue: 1

Cite

APA	Atalmış, E. H. (2019). A Statistical Comparison of Norm-Referenced Assessment Systems Used in Higher Education in Turkey. Journal of Measurement and Evaluation in Education and Psychology, 10(1), 12-29. https://doi.org/10.21031/epod.487335

Cited By

The complexity of the grading system in Turkish higher education

International Journal of Assessment Tools in Education

https://doi.org/10.21449/ijate.1266808
