Reliability and Validity of The Measurements Done by Using Students’ and Teachers’ Evaluation Forms in Ege University School of Medicine

Kevser Vatansever; Şöhret Aydemir; Hilal Batı; Cenk Can; Mahmut Çoker; Selda Erensoy Erensoy; Figen Gövsa; Özen Başoğlu; Lütfiye Kanıt; Nilgün Kültürsay; Oktay Nazlı; Eser Sözmen; Sıla Elif Törün; Meltem Çiçeklioğlu

doi:10.25282/ted.555238

Research Article

Ege Üniversitesi Tıp Fakültesi Öğrenci ve Öğretim Üyesi Değerlendirme Formları ile Yapılan Ölçümlere İlişkin Geçerlilik - Güvenilirlik

Year 2020, Volume: 19 Issue: 57, 37 - 54, 30.04.2020

Abstract

Introduction

In medical education, program evaluation provides the data that guide program development. The evaluation instruments used are expected to be valid, reliable and low-cost, and to be usable for collecting opinions from groups such as students, teachers and graduates.

Program evaluation activities have been an important component of the program development work that began at Ege University School of Medicine in 2001.

Materials and Methods

The reliability and validity of the results obtained with the current student and teacher evaluation forms were assessed in a methodological study. Cronbach's alpha coefficient was calculated to determine internal consistency reliability. To determine content validity, item content validity ratios were calculated in an expert panel. Exploratory factor analysis was applied to determine construct validity.

Interrater agreement and interrater reliability were examined. For the intraclass correlation coefficient related to interrater reliability, consistency was examined using a two-way mixed-effects model with a 95% confidence interval.

Results

Cronbach's alpha values were above 0.7 except for two factors of the Second and Third Year Block Evaluation Form, which supports the internal consistency reliability of the results from the updated student and teacher evaluation forms. In the construct validity analysis, the student forms, except for the Second and Third Year Block Evaluation Form, were found to be unidimensional, while the Teacher Evaluation Form had a three-factor structure.

The interrater agreement and reliability coefficients were below acceptable levels for Factor II of the Second and Third Year Block Evaluation Form and for Factor III of the Teacher Block and Clerkship Evaluation Form, so the reliability of the measurement is not supported for these dimensions.

The results of this study provide evidence supporting the validity and reliability of measurements made with the current evaluation forms used in the EÜTF program evaluation system, and also identify points that call for care when interpreting and using the results.

Keywords

program evaluation, internal consistency, construct validity, student evaluation form, teacher evaluation form


Abstract

Introduction

Program evaluation in medical education provides data that guide the program development process. Evaluation instruments are expected to be valid, reliable and low-cost, and suitable for obtaining the opinions of different groups such as students, teachers and graduates.

Program evaluation has been a crucial component of program development at Ege University School of Medicine since 2001.

Materials and Methods

In this methodological study, the reliability and validity of the results obtained with the updated student and teacher evaluation forms were assessed. Cronbach's alpha coefficient was calculated to assess internal consistency reliability. Item content validity ratios were calculated in an expert panel to assess content validity. Exploratory factor analysis was used to determine construct validity.
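To make the two item-level statistics above concrete, here is a minimal Python sketch. It assumes the usual textbook definitions: Cronbach's alpha computed from item and total-score sample variances, and Lawshe's content validity ratio for the expert panel. The abstract does not state which formulas or software the authors used, and the function names and sample scores below are hypothetical.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha; rows are respondents, columns are items."""
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def content_validity_ratio(n_essential, n_experts):
    """Lawshe's CVR (assumed here): (n_e - N/2) / (N/2), from -1 to 1."""
    half = n_experts / 2
    return (n_essential - half) / half


# Hypothetical data: 5 respondents rating 3 items on a 1-5 scale.
scores = [[4, 5, 4], [3, 4, 3], [5, 5, 4], [2, 3, 2], [4, 4, 5]]
print(round(cronbach_alpha(scores), 3))     # 0.904
# Hypothetical panel: 8 of 10 experts rate an item as essential.
print(content_validity_ratio(8, 10))        # 0.6
```

With these made-up scores alpha is above the 0.7 level used as the benchmark in this study; a real analysis would run the same calculation separately for each form or factor.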

Interrater agreement and interrater reliability, which bear on the interpretation and use of the judgments of different observers, were analyzed. For interrater reliability, the intraclass correlation coefficient (ICC) was calculated for consistency using a two-way mixed-effects model, with a 95% confidence interval.
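The consistency ICC from a two-way mixed-effects model can be sketched from the mean squares of a two-way ANOVA without replication. The Python below is an illustration under stated assumptions, not the authors' procedure: the abstract does not say whether single or average measures were reported, so the hypothetical function `icc_consistency` returns both, and it omits the 95% confidence interval, which needs F-distribution quantiles. The rating matrix is illustrative only.

```python
def icc_consistency(ratings):
    """Two-way mixed-effects, consistency ICC.

    ratings: one row per rated target, one column per rater.
    Returns (single-measure ICC, average-measure ICC).
    """
    n = len(ratings)                        # number of targets
    k = len(ratings[0])                     # number of raters
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(col) / n for col in zip(*ratings)]

    # Sums of squares for targets (rows), raters (columns) and residual.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    single = (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)
    average = (ms_rows - ms_error) / ms_rows
    return single, average


# Illustrative matrix: 6 targets, each scored by the same 4 raters.
ratings = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
single, average = icc_consistency(ratings)
print(round(single, 2), round(average, 2))  # 0.71 0.91
```

Because the consistency form ignores systematic differences between raters, the column (rater) sum of squares is removed from the error term but does not enter either ratio.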

Results

Cronbach's alpha coefficients were above 0.7, except for two factors of the Second and Third Year Block Evaluation Form, which supports the internal consistency reliability of the updated student and teacher evaluation forms. In the construct validity analysis, the student forms, except for the Second and Third Year Block Evaluation Form, were found to be unidimensional, while the teacher evaluation form had a three-factor structure.

The reliability of measurement was not supported for Factor II of the Second and Third Year Block Evaluation Form or for Factor III of the Teacher Evaluation Form, because the interrater agreement and reliability coefficients for these factors were below acceptable levels.
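The abstract does not name the interrater agreement index that was used. As one common possibility, the sketch below computes the within-group agreement index r_WG of James, Demaree and Wolf and its multi-item form r_WG(J), assuming a uniform "no agreement" null distribution and sample variances. The function names, the 5-point scale and the ratings are hypothetical.

```python
from statistics import mean, variance


def rwg(ratings, n_options):
    """Single-item r_WG against a uniform (no-agreement) null."""
    expected_var = (n_options ** 2 - 1) / 12    # variance of the uniform null
    return 1 - variance(ratings) / expected_var


def rwg_j(item_ratings, n_options):
    """Multi-item r_WG(J); item_ratings holds one list of ratings per item."""
    j = len(item_ratings)
    expected_var = (n_options ** 2 - 1) / 12
    ratio = mean(variance(r) for r in item_ratings) / expected_var
    return j * (1 - ratio) / (j * (1 - ratio) + ratio)


# Hypothetical data: 4 raters scoring the same target on a 5-point scale.
item_a = [3, 4, 4, 5]
item_b = [4, 4, 4, 4]
print(round(rwg(item_a, 5), 3))                 # 0.667
print(round(rwg_j([item_a, item_b], 5), 3))     # 0.909
```

Values near 1 indicate that raters gave nearly identical scores; when the observed variance exceeds the null variance the index turns negative, a case this sketch leaves untruncated.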

The results of this study provide evidence supporting the validity and reliability of measurements made with the current evaluation forms used in the program evaluation system of Ege University School of Medicine, and identify points that require attention when interpreting and using those measurements.

Keywords

program evaluation, internal consistency, construct validity, student evaluation form, teacher evaluation form


Details

Primary Language	English
Subjects	Health Care Administration
Journal Section	Original Article
Authors	Kevser Vatansever (ORCID 0000-0002-8943-9874); Şöhret Aydemir (ORCID 0000-0001-8354-9100); Hilal Batı (ORCID 0000-0002-8781-6816); Cenk Can (ORCID 0000-0002-1992-5367); Mahmut Çoker (ORCID 0000-0001-6494-9539); Selda Erensoy Erensoy (ORCID 0000-0002-7052-8359); Figen Gövsa (ORCID 0000-0001-9635-6308); Özen Başoğlu (ORCID 0000-0001-8168-6611); Lütfiye Kanıt (ORCID 0000-0001-5160-411X); Nilgün Kültürsay (ORCID 0000-0003-0867-1514); Oktay Nazlı (ORCID 0000-0002-3748-9941); Eser Sözmen (ORCID 0000-0002-6383-6724); Sıla Elif Törün (ORCID 0000-0001-9022-6457); Meltem Çiçeklioğlu (ORCID 0000-0002-7059-7573)
Publication Date	April 30, 2020
Submission Date	August 26, 2019
Published in Issue	Year 2020 Volume: 19 Issue: 57

Cite

Vancouver	Vatansever K, Aydemir Ş, Batı H, Can C, Çoker M, Erensoy SE, et al. Reliability and Validity of The Measurements Done by Using Students’ and Teachers’ Evaluation Forms in Ege University School of Medicine. Tıp Eğitimi Dünyası. 2020;19(57):37-54.
