Research Article

Comparison of Passing Scores Determined by The Angoff Method in Different Item Samples

Year 2020, Volume 7, Issue 1, 80-97, 01.04.2020
https://doi.org/10.21449/ijate.699479

Abstract

In this study, the efficiency of various random sampling methods for reducing the number of items rated by judges in an Angoff standard-setting study was examined, and the methods were compared with each other. First, the full-length test was formed by combining the mathematics subtests of the 2012 and 2013 Placement Tests. Then, simple random sampling (SRS), content-stratified (C-SRS), item-difficulty-stratified (D-SRS), and content-by-difficulty (CD-SRS) random sampling methods were used to construct subsets of different lengths (30%, 40%, 50%, and 70% of the full-length test). In total, 16 study conditions (4 methods x 4 subset lengths) were investigated. In the data analysis, ANOVA was conducted to examine whether the minimum passing scores (MPSs) for the subsets differed significantly from the MPS of the full-length test. As a follow-up analysis, RMSE and SEE (standard error of estimation) values were calculated for each study condition. Results indicated that the estimated Angoff MPSs differed significantly from the full-test Angoff MPS (45.12) only in the 30% C-SRS, 40% C-SRS, 30% D-SRS, and 30% CD-SRS conditions. According to the RMSE values, the C-SRS method had the smallest error, while the SRS method had the largest. Moreover, examination of the SEE values revealed that to obtain estimates similar to the full-test Angoff MPS (within one SEE), sampling 50% of the items with the C-SRS method is sufficient. The C-SRS method was the most effective of the four at reducing the number of items rated by judges in MPS-setting studies conducted with the Angoff method.

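The sampling design and the RMSE index described in the abstract can be sketched in a few lines of code. The snippet below is a minimal illustration, not the authors' procedure or data: it assumes a hypothetical 100-item pool with content and difficulty labels and made-up mean Angoff ratings, draws content-by-difficulty stratified (CD-SRS) subsets at the four studied proportions, and computes the RMSE of the subset-based MPS estimates against the full-test MPS over repeated random draws.

```python
import random
import statistics

# Hypothetical item pool (illustrative values only): each item carries the mean
# Angoff rating across judges, a content area, and a difficulty band.
items = [
    {"angoff": random.uniform(0.2, 0.9),
     "content": random.choice(["numbers", "algebra", "geometry"]),
     "difficulty": random.choice(["easy", "medium", "hard"])}
    for _ in range(100)
]

def angoff_mps(subset, full_length):
    """Angoff MPS of a subset, rescaled to the length of the full test."""
    return sum(item["angoff"] for item in subset) * full_length / len(subset)

def cd_srs(pool, proportion):
    """Content-by-difficulty stratified random sampling (CD-SRS): draw the same
    proportion of items from every content-by-difficulty cell of the pool."""
    cells = {}
    for item in pool:
        cells.setdefault((item["content"], item["difficulty"]), []).append(item)
    subset = []
    for cell_items in cells.values():
        k = max(1, round(len(cell_items) * proportion))
        subset.extend(random.sample(cell_items, k))
    return subset

full_mps = angoff_mps(items, len(items))

# RMSE of subset-based MPS estimates against the full-test MPS,
# averaged over repeated random draws for each subset length.
for proportion in (0.3, 0.4, 0.5, 0.7):
    squared_errors = [
        (angoff_mps(cd_srs(items, proportion), len(items)) - full_mps) ** 2
        for _ in range(50)
    ]
    rmse = statistics.mean(squared_errors) ** 0.5
    print(f"{int(proportion * 100):>2}% CD-SRS subset: RMSE = {rmse:.3f}")
```

Restricting the stratification key to only the content area or only the difficulty band gives the C-SRS and D-SRS variants, and dropping it entirely gives plain SRS; the SEE reported in the abstract is a separate index based on the spread of judges' ratings, which this sketch does not model.
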
References

  • Behuniak, P., Gable, R. K., & Archambault, F. X. (1982). The validity of categorized proficiency test scores. Educational and Psychological Measurement, 42, 247-252.
  • Berk, R. A. (1996). Standard setting: The next generation (where few psychometricians have gone before!). Applied Measurement in Education, 9, 215-235.
  • Buckendahl, C. W., Ferdous, A. A. & Gerrow, J. (2010). Recommending cut scores with a subset of items: An empirical illustration. Practical Assessment, 15(6), 1-10.
  • Cizek, G. J. (2001). Setting performance standards: Concepts, methods and perspectives. Mahwah, NJ: Lawrence Erlbaum Associates.
  • Çetin, S. (2011). İşaretleme ve Angoff standart belirleme yöntemlerinin karşılaştırılması [Comparison of Bookmark and Angoff standard setting methods] (Doctoral dissertation). Hacettepe University, Ankara.
  • Downing, S. M. (2006). Selected-Response item formats in test development. In T. M. Haladyna & S. M. Downing (Ed.), Handbook of test development (pp. 287-300). Mahwah, New Jersey: Routledge.
  • Ferdous, A. A., & Plake, B. S. (2005). The use of subsets of test questions in an Angoff standard setting method. Educational and Psychological Measurement, 65(2), 185-201.
  • Ferdous, A. A., & Plake, B. S. (2007). Item selection strategy for reducing the number of items rated in an Angoff standard setting study. Educational and Psychological Measurement, 67(2), 193-206.
  • Fraenkel, J. R., Wallen, N. E., & Hyun, H. H. (2012). How to design and evaluate research in education (8th ed.). New York: McGraw Hill.
  • Hambleton, R. K. (1998). Setting performance standards on achievement tests: Meeting the requirements of Title I. In L. N. Hansche (Ed.), Handbook for the development of performance standards (pp. 87-114). Washington, DC: Council of Chief State School Officers.
  • Hambleton, R. K., & Pitoniak, M. (2006). Setting performance standards. In R. L. Brennan (Ed.), Educational Measurement (pp. 433–470). Westport, CT: Praeger.
  • Impara, J. C., & Plake, B. S. (1997). Standard setting: An alternative approach. Journal of Educational Measurement, 34, 353-366.
  • Irwin, P. (2007). An alternative examinee-centered standard setting strategy (Doctoral dissertation). University of Nebraska, USA.
  • Jaeger, R. M. (1989). Certification of student competence. In R. L. Linn (Ed.), Educational measurement (pp. 485-514). New York: American Council on Education/Macmillan.
  • Kane, M. T. (2001). Current concerns in validity theory. Journal of Educational Measurement, 38(4), 319-342.
  • Kannan, P., Sgammato, A., & Tannenbaum, R. J. (2015). Evaluating the operational feasibility of using subsets of items to recommend minimal competency cut scores. Applied Measurement in Education, 28(4), 292-307.
  • Kannan, P., Sgammato, A., Tannenbaum, R. J., & Katz, I. R. (2015). Evaluating the consistency of Angoff-based cut scores using subsets of items within a generalizability theory framework. Applied Measurement in Education, 28(3), 169-186.
  • Lewis, D. M., Mitzel, H. C., & Green, D. R. (1996). Standard setting: A bookmark approach. In D. R. Green (Ed.), IRT-based standard setting procedures utilizing behavioral anchoring. Symposium conducted at the Council of Chief State School Officers National Conference on Large-Scale Assessment, Phoenix, AZ.
  • MEB (2009). İlköğretim matematik dersi 6-8. sınıflar öğretim programı ve kılavuzu [Elementary mathematics curriculum and guide for grades 6-8]. Retrieved November 29, 2019, from https://ttkb.meb.gov.tr
  • Mehrens, W. A. (1995). Methodological issues in standard setting for educational exams. In Proceedings of Joint Conference on Standard Setting for Large-Scale Assessments (pp. 221-263). Washington, DC: National Assessment Governing Board and National Center for Education Statistics.
  • Norcini, J., Shea, J., & Ping, J. C. (1988). A note on the application of multiple matrix sampling to standard setting. Journal of Educational Measurement, 25(2), 159–164.
  • Özçelik, D. A. (2013). Test Hazırlama Kılavuzu [Test Preparation Guide]. Pegem Akademi Yayıncılık.
  • Pallant, J. (2005). SPSS survival manual: A step by step guide to data analysis using SPSS for Windows (2nd ed.). Crows Nest, Australia: Allen & Unwin.
  • Plake, B. S., & Impara, J. C. (2001). The fourteenth mental measurements yearbook. Lincoln, NB: Buros Institute of Mental Measurements.
  • Reckase, M. D. (2001). Innovative methods for helping standard-setting participants to perform their task. The role of feedback regarding consistency, accuracy, and impact. In G. J. Cizek (Ed.), Setting performance standards: Concepts, methods, and perspectives (pp.159-174). Mahwah, NJ: Erlbaum.
  • Sireci, S. G., Patelis, T., Rizavi, S., Dillingham, A. M., & Rodriguez, G. (2000). Setting standards on a computerized-adaptive placement examination. Laboratory of Psychometric and Evaluative Research Report No. 378.
  • Siegel, S. (1956). Nonparametric statistics for the behavioral sciences. New York: McGraw-Hill.
  • Smith, T. N. (2011). Using stratified item selection to reduce the number of items rated in standard setting. University of South Florida, USA.

Details

Primary Language: English
Subjects: Studies on Education
Section: Articles
Authors

Hakan Kara (ORCID: 0000-0002-2396-3462)

Sevda Çetin (ORCID: 0000-0001-5483-595X)

Publication Date: April 1, 2020
Submission Date: October 1, 2019
Published in Issue: Year 2020, Volume 7, Issue 1

Cite

APA Kara, H., & Çetin, S. (2020). Comparison of Passing Scores Determined by The Angoff Method in Different Item Samples. International Journal of Assessment Tools in Education, 7(1), 80-97. https://doi.org/10.21449/ijate.699479
