Research Article

Investigating the quality of a high-stakes EFL writing assessment procedure in the Turkish higher education context

Volume: 11 Number: 4 November 15, 2024

Abstract

Employing G-theory and rater interviews, this study investigated how a high-stakes writing assessment procedure (i.e., a single-task, single-rater, holistic scoring procedure) affected the variability and reliability of its scores within the Turkish higher education context. Thirty-two essays written on two different writing tasks (i.e., narrative and opinion) by 16 EFL students studying at a Turkish state university were scored by 10 instructor raters both holistically and analytically. After the raters completed the scoring procedure, semi-structured individual interviews were held with them to gain insight into their views on the quality of the current scoring procedure. The G-theory results showed that the reliability coefficients obtained from the current scoring procedure would not be sufficient to draw sound conclusions. The quantitative results were partly supported by the qualitative data. Implications for improving the quality of the current high-stakes EFL writing assessment policy are discussed.
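To illustrate the kind of analysis the abstract describes, the sketch below estimates G-theory variance components and a generalizability (relative) coefficient for a simplified persons-by-raters crossed design. This is an assumption-laden illustration, not the authors' actual analysis: the study's design also involves a task facet and was presumably run with dedicated G-theory software, and the function name `g_study` and the two-facet simplification are hypothetical.

```python
# Minimal sketch of a persons-by-raters (p x r) G-study for a fully crossed
# design (every person scored by every rater), assuming the simplified
# two-facet case. Variance components are estimated from the expected mean
# squares of a two-way ANOVA without replication.
import numpy as np

def g_study(scores):
    """scores: 2-D array, rows = persons, columns = raters."""
    n_p, n_r = scores.shape
    grand = scores.mean()
    person_means = scores.mean(axis=1)
    rater_means = scores.mean(axis=0)

    # Sums of squares for persons, raters, and the residual
    ss_p = n_r * ((person_means - grand) ** 2).sum()
    ss_r = n_p * ((rater_means - grand) ** 2).sum()
    ss_total = ((scores - grand) ** 2).sum()
    ss_pr = ss_total - ss_p - ss_r  # p x r interaction confounded with error

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_pr = ss_pr / ((n_p - 1) * (n_r - 1))

    # Expected-mean-squares solutions (negative estimates truncated at zero)
    var_pr = ms_pr
    var_p = max((ms_p - ms_pr) / n_r, 0.0)
    var_r = max((ms_r - ms_pr) / n_p, 0.0)

    # Relative G-coefficient for a decision study with n_r raters:
    # true-score (person) variance over itself plus relative error
    g_coef = var_p / (var_p + var_pr / n_r)
    return var_p, var_r, var_pr, g_coef
```

In a single-rater, single-task procedure like the one investigated, the denominator uses `var_pr / 1`, so rater-related error weighs fully on the score, which is one way the low reliability coefficients reported in the study can arise.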

Keywords

Supporting Institution

This study was not supported by any institution.


Details

Primary Language

English

Subjects

Measurement and Evaluation in Education (Other)

Journal Section

Research Article

Early Pub Date

October 21, 2024

Publication Date

November 15, 2024

Submission Date

November 1, 2023

Acceptance Date

August 26, 2024

Published in Issue

Year 2024 Volume: 11 Number: 4

APA
Sarı, E. (2024). Investigating the quality of a high-stakes EFL writing assessment procedure in the Turkish higher education context. International Journal of Assessment Tools in Education, 11(4), 660-674. https://doi.org/10.21449/ijate.1384824
