Research Article

Year 2021, Volume 8, Issue 4, 239 - 252, 01.12.2021
https://doi.org/10.17275/per.21.88.8.4

Examining the Differential Rater Functioning in the Process of Assessing Writing Skills of Middle School 7th Grade Students

Abstract

When students complete writing tasks that require higher-order thinking skills, one of the most important problems is scoring those tasks objectively. When raters, influenced by various environmental factors, give scores above or below the students' actual performance, the consistency of the measurements suffers. Inconsistencies in scoring undermine the validity and reliability of the measures of student performance and call the resulting scores into question. To protect the validity and reliability of these measurements, it is important to identify rater behavior and correct the sources of error. This study aims to analyze differential rater functioning (DRF), one of the problematic rater behaviors, in the evaluation of compositions written by middle school 7th-grade students within the scope of the Turkish course. Eighty-six students attending a public school participated in the study. The students' compositions were rated with an analytic rubric by 8 teachers from different institutions. In this correlational study, the many-facet Rasch model was used, and five variables were examined: students, raters, students' gender, students' competence, and evaluation criteria. Two-way interaction analyses (student gender x rater and student competence x rater) were used to examine whether the raters showed DRF at the individual and group levels. The findings revealed that group-level DRF did not contaminate the measurements, whereas individual-level DRF did. DRF contaminated the measurements of successful students the least. Severe and lenient raters in particular were found to show DRF. The raters who showed DRF were also the most lenient raters, although these raters did not show DRF with respect to student gender.
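
As a rough illustration of the model named in the abstract, the sketch below computes rating-category probabilities under a rating-scale many-facet Rasch model, in which the log-odds of receiving category k rather than k-1 equal student ability minus criterion difficulty minus rater severity minus the threshold for step k. The extra `bias` term stands in for a rater-by-group interaction of the kind a DRF analysis looks for. All function names, parameter names, and numeric values are illustrative assumptions; this is not the authors' analysis or data.

```python
import math


def category_probabilities(theta, criterion_difficulty, rater_severity,
                           thresholds, bias=0.0):
    """Probabilities of categories 0..K under a rating-scale many-facet Rasch model.

    thresholds[h - 1] is the threshold for the step from category h-1 to h.
    bias is a hypothetical rater-by-group interaction term in logits;
    a positive value means the rater is harsher toward that group.
    """
    # Log-odds of moving up one category, before the step threshold is subtracted.
    base = theta - criterion_difficulty - rater_severity - bias
    # Cumulative sums of (base - threshold) give unnormalized log-probabilities.
    log_numerators = [0.0]
    for tau in thresholds:
        log_numerators.append(log_numerators[-1] + base - tau)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(probabilities):
    """Expected rating given the category probabilities."""
    return sum(k * p for k, p in enumerate(probabilities))


# Illustrative values only: one student, one criterion, one rater, four categories.
thresholds = [-1.0, 0.0, 1.0]
fair = category_probabilities(0.5, 0.0, 0.2, thresholds)
harsh_to_group = category_probabilities(0.5, 0.0, 0.2, thresholds, bias=0.8)
print(round(expected_score(fair), 3), round(expected_score(harsh_to_group), 3))
```

With a positive `bias`, the same student receives a lower expected score from the same rater, which is the pattern an individual-level DRF analysis is designed to flag.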

Details

Primary Language English
Subjects Education, Scientific Disciplines
Section Research Articles
Authors

Aslıhan ERMAN ASLANOĞLU (Corresponding Author)
UFUK ÜNİVERSİTESİ, EĞİTİM FAKÜLTESİ
0000-0002-1364-7386
Türkiye


Mehmet ŞATA
AĞRI İBRAHİM ÇEÇEN ÜNİVERSİTESİ, EĞİTİM FAKÜLTESİ
0000-0003-2683-4997
Türkiye

Publication Date December 1, 2021
Published in Issue Year 2021, Volume 8, Issue 4

Cite

APA Erman Aslanoğlu, A. & Şata, M. (2021). Examining the Differential Rater Functioning in the Process of Assessing Writing Skills of Middle School 7th Grade Students. Participatory Educational Research, 8(4), 239-252. DOI: 10.17275/per.21.88.8.4