Research Article

Standard Setting in Academic Writing Assessment through Objective Standard Setting Method

Year 2022, Volume: 9 Issue: 1, 80 - 97, 10.03.2022
https://doi.org/10.21449/ijate.1059304

Abstract

Performance standards have important consequences for all stakeholders in the assessment of L2 academic writing. These standards not only describe levels of writing performance but also provide a basis for evaluative decisions about academic writing. This high-stakes role requires enhancing the objectivity of the standard-setting procedure. Accordingly, this study aims to shed light on the usefulness of the Objective Standard Setting (OSS) method in specifying proficiency levels in L2 academic writing. Following a descriptive research design, the sample comprised examinees and raters who were student teachers at the university level. An essay task and an analytic writing rubric were employed as the data collection tools, and the data were analyzed with the OSS method and two-step cluster analysis. The OSS results, based on the many-facet Rasch measurement model (MFRM), outline how the scoring criteria are distributed across the proficiency levels, and the main OSS findings were validated by the two-step cluster analysis. In short, the OSS method can be used in practice to help stakeholders make objective judgments about examinees' target performance.
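
For readers unfamiliar with the methods named above, a brief sketch may help; the notation below is standard MFRM usage (cf. Linacre, 2017) rather than something reproduced from the article itself. With examinee, rater, and criterion facets, the many-facet Rasch measurement model can be written as

    \log\!\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k

where P_{nijk} is the probability that examinee n is scored in category k by rater j on criterion i, B_n is examinee ability, D_i is criterion difficulty, C_j is rater severity, and F_k is the step calibration, all on a common logit scale. In Stone's (2001) OSS logic, the cut score for a proficiency level is then anchored to the calibrations of the criteria judged essential for that level, e.g. \theta_{\text{cut}} = \frac{1}{m}\sum_{i=1}^{m} D_i over the m essential criteria, optionally adjusted for measurement error.

The validation step can likewise be pictured with a small, hypothetical sketch: cluster the MFRM ability estimates and compare the resulting group boundaries with the OSS cut scores. The Python code below uses scikit-learn's Birch, whose CF-tree pre-clustering followed by a final agglomerative step loosely mirrors SPSS's two-step procedure; the data and all names are illustrative, not taken from the study.

    import numpy as np
    from sklearn.cluster import Birch

    rng = np.random.default_rng(42)
    # Illustrative examinee ability estimates (logits), e.g., exported from FACETS.
    abilities = np.concatenate([
        rng.normal(-1.5, 0.4, 40),  # lower-proficiency group
        rng.normal(0.0, 0.4, 40),   # intermediate group
        rng.normal(1.5, 0.4, 40),   # higher-proficiency group
    ]).reshape(-1, 1)

    # Birch builds a CF tree (pre-clustering) and, given an integer n_clusters,
    # finishes with agglomerative clustering -- the two "steps" of the method.
    labels = Birch(n_clusters=3).fit_predict(abilities)

    # Each cluster's range on the logit scale can be compared with the OSS cuts.
    for k in range(3):
        members = abilities[labels == k].ravel()
        print(f"cluster {k}: n={members.size}, "
              f"range=({members.min():.2f}, {members.max():.2f})")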

References

  • Bejar, I.I. (2008). Standard setting: What is it? Why is it important? R&D Connections, 7, 1-6.
  • Best, J.W., & Kahn, J.V. (2006). Research in education (10th ed.). Pearson.
  • Bichi, A.A., Talib, R., Embong, R., Mohamed, H.B., Ismail, M.S., & Ibrahim, A. (2019). Rasch-based objective standard setting for university placement test. Eurasian Journal of Educational Research, 19(84), 57-70. https://doi.org/10.14689/ejer.2019.84.3
  • Chen, W.H., & Thissen, D. (1997). Local dependence indexes for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22(3), 265-289. https://doi.org/10.3102/10769986022003265
  • Cizek, G.J. (1993). Reconsidering standards and criteria. Journal of Educational Measurement, 30(2), 93-106. https://doi.org/10.1111/j.1745-3984.1993.tb01068.x
  • Cizek, G.J. (2012). An introduction to contemporary standard setting: Concepts, characteristics, and contexts. In G.J. Cizek (Ed.), Setting performance standards: Concepts, methods, and perspectives. Lawrence Erlbaum Associates.
  • Cokluk, O., Sekercioglu, G., & Buyukozturk, S. (2012). Sosyal bilimler icin cok degiskenli istatistik: SPSS ve LISREL uygulamalari [Multivariate statistics for social sciences: SPSS and LISREL applications] (2nd ed.). Pegem Akademi.
  • Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching, assessment. Cambridge University Press.
  • Davies, A., Brown, A., Elder, C., Hill, K., Lumley, T., & McNamara, T. (1999). Studies in language testing 7: Dictionary of language testing. Cambridge University Press.
  • Davis-Becker, S.L., Buckendahl, C.W., & Gerrow, J. (2011). Evaluating the bookmark standard setting method: The impact of random item ordering. International Journal of Testing, 11(1), 24-37. https://doi.org/10.1080/15305058.2010.501536
  • Elder, C., Barkhuizen, G., Knoch, U., & von Randow, J. (2007). Evaluating rater responses to an online training program for L2 writing assessment. Language Testing, 24(1), 37-64. https://doi.org/10.1177/0265532207071511
  • Erkus, A., Sunbul, O., Omur-Sunbul, S., Yormaz, S., & Asiret, S. (2017). Psikolojide olcme ve olcek gelistirme-II [Measurement in psychology and scale development-II] (1st ed.). Pegem Akademi.
  • Fleckenstein, J., Keller, S., Krüger, M., Tannenbaum, R.J., & Köller, O. (2020). Linking TOEFL iBT® writing rubrics to CEFR levels: Cut scores and validity evidence from a standard setting study. Assessing Writing, 43, 1-15. https://doi.org/10.1016/j.asw.2019.100420
  • Fulcher, G. (2013). Practical language testing. Routledge. https://doi.org/10.4324/9780203767399
  • Goodwin, S. (2016). A many-facet Rasch analysis comparing essay rater behavior on an academic English reading/writing test used for two purposes. Assessing Writing, 30, 21-31. https://doi.org/10.1016/j.asw.2016.07.004
  • Green, A. (2018). Linking tests of English for academic purposes to the CEFR: The score user’s perspective. Language Assessment Quarterly, 15(1), 59-74. https://doi.org/10.1080/15434303.2017.1350685
  • Harsch, C., & Rupp, A.A. (2011). Designing and scaling level-specific writing tasks in alignment with the CEFR: A test-centered approach. Language Assessment Quarterly, 8(1), 1-33. https://doi.org/10.1080/15434303.2010.535575
  • Hsieh, M. (2013). An application of multifaceted Rasch measurement in the Yes/No Angoff standard setting procedure. Language Testing, 30(4), 491-512. https://doi.org/10.1177/0265532213476259
  • IELTS (The International English Language Testing System). https://www.ielts.org/
  • Kayri, M. (2007). Two-step clustering analysis in researches: A case study. Eurasian Journal of Educational Research, 28, 89-99.
  • Khalid, M.N. (2011). Cluster analysis: A standard setting technique in measurement and testing. Journal of Applied Quantitative Methods, 6(2), 46-58.
  • Khatimin, N., Aziz, A.A., Zaharim, A., & Yasin, S.H.M. (2013). Development of objective standard setting using Rasch measurement model in Malaysian institution of higher learning. International Education Studies, 6(6), 151-160. https://doi.org/10.5539/ies.v6n6p151
  • Lawshe, C.H. (1975). A quantitative approach to content validity. Personnel Psychology, 28(4), 563-575. https://doi.org/10.1111/j.1744-6570.1975.tb01393.x
  • Linacre, J.M. (2017). A user’s guide to FACETS: Rasch-model computer programs. MESA Press.
  • Livingston, S.A., & Zieky, M.J. (1982). Passing scores: A manual for setting standards of performance on educational and occupational tests. Educational Testing Service.
  • MacDougall, M., & Stone, G.E. (2015). Fortune-tellers or content specialists: Challenging the standard setting paradigm in medical education programmes. Journal of Contemporary Medical Education, 3(3), 135. https://doi.org/10.5455/jcme.20151019104847
  • McDonald, R.P. (1999). Test theory: A unified treatment. Erlbaum.
  • Sata, M., & Karakaya, I. (2021). Investigating the effect of rater training on differential rater function in assessing academic writing skills of higher education students. Journal of Measurement and Evaluation in Education and Psychology, 12(2), 163-181. https://doi.org/10.21031/epod.842094
  • Schaefer, E. (2008). Rater bias patterns in an EFL writing assessment. Language Testing, 25(4), 465-493. https://doi.org/10.1177/0265532208094273
  • Shin, S.Y., & Lidster, R. (2017). Evaluating different standard-setting methods in an ESL placement testing context. Language Testing, 34(3), 357-381. https://doi.org/10.1177/0265532216646605
  • Sireci, S.G., Robin, F., & Patelis, T. (1999). Using cluster analysis to facilitate standard setting. Applied Measurement in Education, 12(3), 301-325. https://doi.org/10.1207/S15324818AME1203_5
  • Sondergeld, T.A., Stone, G.E., & Kruse, L.M. (2020). Objective standard setting in educational assessment and decision making. Educational Policy, 34(5), 735-759. https://doi.org/10.1177/0895904818802115
  • Stone, G.E. (2001). Objective standard setting (or truth in advertising). Journal of Applied Measurement, 2(2), 187-201.
  • Stone, G.E., Koskey, K.L., & Sondergeld, T.A. (2011). Comparing construct definition in the Angoff and objective standard setting models: Playing in a house of cards without a full deck. Educational and Psychological Measurement, 71(6), 942-962. https://doi.org/10.1177/0013164410394338
  • Tannenbaum, R.J., & Wylie, E.C. (2008). Linking English-language test scores onto the Common European Framework of Reference: An application of standard-setting methodology. ETS Research Report Series, 2008(1), i-75. https://doi.org/10.1002/j.2333-8504.2008.tb02120.x
  • Trace, J., Janssen, G., & Meier, V. (2017). Measuring the impact of rater negotiation in writing performance assessment. Language Testing, 34(1), 3-22. https://doi.org/10.1177/0265532215594830
  • Violato, C., Marini, A., & Lee, C. (2003). A validity study of expert judgment procedures for setting cutoff scores on high-stakes credentialing examinations using cluster analysis. Evaluation & the Health Professions, 26(1), 59-72. https://doi.org/10.1177/0163278702250082
  • Weigle, S.C. (2002). Assessing writing. Cambridge University Press. https://doi.org/10.1017/CBO9780511732997
  • Wilson, F.R., Pan, W., & Schumsky, D.A. (2012). Recalculation of the critical values for Lawshe’s content validity ratio. Measurement and Evaluation in Counseling and Development, 45(3), 197-210. https://doi.org/10.1177/0748175612440286
  • Wind, S.A., & Engelhard, G., Jr. (2013). How invariant and accurate are domain ratings in writing assessment? Assessing Writing, 18(4), 278-299.
  • Wright, B.D., & Grosse, M. (1993). How to set standards. Rasch Measurement Transactions, 7(3), 315-316.
  • Yudkowsky, R., Downing, S.M., & Tekian, A. (2009). Standard setting. In R. Yudkowsky & S. Downing (Eds.), Assessment in health professions education (pp. 86-105). Routledge. https://doi.org/10.4324/9781315166902-6

Details

Primary Language English
Subjects Other Fields of Education
Journal Section Articles
Authors

Fatima Nur Fişne 0000-0001-9224-2485

Mehmet Sata 0000-0003-2683-4997

Ismail Karakaya 0000-0003-4308-6919

Publication Date March 10, 2022
Submission Date March 19, 2021
Published in Issue Year 2022 Volume: 9 Issue: 1

Cite

APA Fişne, F. N., Sata, M., & Karakaya, I. (2022). Standard Setting in Academic Writing Assessment through Objective Standard Setting Method. International Journal of Assessment Tools in Education, 9(1), 80-97. https://doi.org/10.21449/ijate.1059304
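
An equivalent BibTeX entry, assembled from the metadata above (the citation key is arbitrary):

    @article{fisne2022standard,
      author  = {Fi{\c{s}}ne, Fatima Nur and Sata, Mehmet and Karakaya, Ismail},
      title   = {Standard Setting in Academic Writing Assessment through Objective Standard Setting Method},
      journal = {International Journal of Assessment Tools in Education},
      year    = {2022},
      volume  = {9},
      number  = {1},
      pages   = {80--97},
      doi     = {10.21449/ijate.1059304}
    }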
