Comparing Fully Crossed and Nested Designs Where Items Are Nested in Raters in Generalizability Theory
Abstract
This study compares the G and Phi coefficients obtained from a fully crossed (b×p×m) design
(b: person; p: rater; m: item) with those obtained from a nested (b×(m:p)) design, in which
items are nested in raters while persons are crossed with both items and raters, in the
grading of English composition writing skills. The study group consists of students attending
a private university and three raters. According to the results, the G and Phi coefficients of
the fully crossed design were higher. In terms of sources of variance, person accounts for the
highest percentage of total variance in the fully crossed design, whereas occasion accounts for
a relatively low percentage. The findings indicate that the fully crossed design yields more
reliable results in classroom practice, and it is therefore recommended for that setting.
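For orientation, the two coefficients compared here can be written in their standard generalizability theory form. This is a sketch using the abstract's facet labels (b: person; p: rater; m: item), not the study's own derivation; the symbols n_p and n_m (numbers of raters and items in the decision study) are assumed notation that does not appear in the abstract.

```latex
% Fully crossed design b x p x m
% G coefficient (relative decisions)
E\rho^2_{b \times p \times m} =
  \frac{\sigma^2_b}
       {\sigma^2_b + \dfrac{\sigma^2_{bp}}{n_p} + \dfrac{\sigma^2_{bm}}{n_m}
        + \dfrac{\sigma^2_{bpm,e}}{n_p n_m}}

% Phi coefficient (absolute decisions)
\Phi_{b \times p \times m} =
  \frac{\sigma^2_b}
       {\sigma^2_b + \dfrac{\sigma^2_p}{n_p} + \dfrac{\sigma^2_m}{n_m}
        + \dfrac{\sigma^2_{bp}}{n_p} + \dfrac{\sigma^2_{bm}}{n_m}
        + \dfrac{\sigma^2_{pm}}{n_p n_m} + \dfrac{\sigma^2_{bpm,e}}{n_p n_m}}

% Nested design b x (m:p): items nested in raters
E\rho^2_{b \times (m:p)} =
  \frac{\sigma^2_b}
       {\sigma^2_b + \dfrac{\sigma^2_{bp}}{n_p}
        + \dfrac{\sigma^2_{b(m:p),e}}{n_p n_m}}

\Phi_{b \times (m:p)} =
  \frac{\sigma^2_b}
       {\sigma^2_b + \dfrac{\sigma^2_p}{n_p} + \dfrac{\sigma^2_{m:p}}{n_p n_m}
        + \dfrac{\sigma^2_{bp}}{n_p} + \dfrac{\sigma^2_{b(m:p),e}}{n_p n_m}}
```

In the nested design the item main effect and the rater-by-item interaction cannot be separated, so they appear together as the single component for items within raters; likewise the person-by-item effect is confounded with the residual.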
Keywords
Interrater reliability, Generalizability theory, Fully crossed designs, Nested designs.