Research Article

Automatic item generation for online measurement and evaluation: Turkish literature items

Year 2023, Volume 10, Issue 2, 218-231, 26.06.2023
https://doi.org/10.21449/ijate.1249297

Abstract

Developments in the field of education have significantly affected test development processes, and computer-based testing has been adopted by many institutions. In Turkey, research on administering measurement and evaluation tools in computer environments for distance education is gaining momentum. Computer-based testing applications, which offer significant advantages to practitioners and test takers, require a large pool of items. However, preparing a large item pool demands considerable time, effort, and cost. To overcome this problem, automatic item generation, which brings together subject matter experts and computer technology in item development, has come into widespread use. In the present research, the steps for implementing automatic item generation are explained through an example. In this study, which was based on the fundamental research method, a total of 2560 items were first generated using computer technology and subject matter experts (SMEs) in the field of Turkish literature. In the second stage, 60 randomly selected items were examined. The results indicate that automatic item generation can be used to create a large item pool for online measurement and evaluation applications.

Thanks

This study was prepared at the University of Alberta, where we carried out our research within the scope of the TÜBİTAK 2219 Postdoctoral Research Fellowship Program. We thank TÜBİTAK for its support.

References

  • Adıgüzel, A. (2020). Teachers’ views on distance education and evaluation of student success in the pandemic process. Milli Eğitim Dergisi, 49(1), 253-271. https://doi.org/10.37669/milliegitim.781998
  • Alves, C.B., Gierl, M.J., & Lai, H. (2010). Using automated item generation to promote principled test design and development. American Educational Research Association, Denver, CO, USA.
  • Arendasy, M.E., & Sommer, M. (2012). Using automatic item generation to meet the increasing item demands of high-stakes educational and occupational assessment. Learning and Individual Differences, 22(1), 112-117.
  • MEB. (2020). FATİH projesi (Fırsatları Artırma ve Teknolojiyi İyileştirme Hareketi). Retrieved from http://fatihprojesi.meb.gov.tr
  • ASGOIG. (2022). Automatic item generation software. https://asgoig.com/
  • Bai, Y. (2019). Cognitive Diagnostic Models-based Automatic Item Generation: Item Feature Exploration and Calibration Model Selection. Columbia University.
  • Balta, Y., & Türel, Y. (2013). An examination on various measurement and evaluation methods used in online distance education. Turkish Studies-International Periodical For The Languages, Literature and History of Turkish or Turkic, 8(3), 37-45. http://dx.doi.org/10.7827/TurkishStudies.427
  • Bayardo, R.J., Ma, Y., & Srikant, R. (2007). Scaling up all pairs similarity search. Proceedings of the 16th international conference on World Wide Web.
  • Bennett, R.E. (2011). Formative assessment: A critical review. Assessment in Education: Principles, Policy & Practice, 18(1), 5-25.
  • Chen, B., Zilles, C., West, M., & Bretl, T. (2019). Effect of discrete and continuous parameter variation on difficulty in automatic item generation. In Artificial Intelligence in Education: 20th International Conference, AIED 2019, Chicago, IL, USA, June 25-29, 2019, Proceedings, Part I.
  • Choi, J., Kim, H., & Pak, S. (2018). Evaluation of Automatic Item Generation Utilities in Formative Assessment Application for Korean High School Students. Journal of Educational Issues, 4(1), 68-89.
  • Choi, J., & Zhang, X. (2019). Computerized item modeling practices using computer adaptive formative assessment automatic item generation system: A tutorial. The Quantitative Methods for Psychology, 15(3), 214-225.
  • Clark, C.M., & Rust, F.O.C. (2006). Learning-centered assessment in teacher education. Studies in Educational Evaluation, 32(1), 73-82.
  • Colvin, K.F. (2014). Effect of automatic item generation on ability estimates in a multistage test. University of Massachusetts Amherst.
  • Corbett, A.T., & Anderson, J.R. (2001). Locus of feedback control in computer-based tutoring: Impact on learning rate, achievement and attitudes. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.
  • Davey, T. (2011). A Guide to Computer Adaptive Testing Systems. Council of Chief State School Officers.
  • Embretson, S., & Yang, X. (2006). Automatic item generation and cognitive psychology. Handbook of Statistics, 26, 747-768.
  • Freund, P.A., Hofer, S., & Holling, H. (2008). Explaining and controlling for the psychometric properties of computer-generated figural matrix items. Applied Psychological Measurement, 32(3), 195-210.
  • Gaytan, J., & McEwen, B.C. (2007). Effective online instructional and assessment strategies. The American Journal of Distance Education, 21(3), 117-132.
  • Gierl, M.J., & Haladyna, T.M. (2012). Automatic item generation: Theory and practice. Routledge.
  • Gierl, M.J., & Lai, H. (2012). The role of item models in automatic item generation. International Journal of Testing, 12(3), 273-298.
  • Gierl, M.J., & Lai, H. (2013). Instructional topics in educational measurement (ITEMS) module: Using automated processes to generate test items. Educational Measurement: Issues and Practice, 32(3), 36-50.
  • Gierl, M.J., & Lai, H. (2018). Using automatic item generation to create solutions and rationales for computerized formative testing. Applied Psychological Measurement, 42(1), 42-57.
  • Gierl, M.J., Lai, H., Pugh, D., Touchie, C., Boulais, A.-P., & De Champlain, A. (2016). Evaluating the psychometric characteristics of generated multiple-choice test items. Applied Measurement in Education, 29(3), 196-210.
  • Gierl, M.J., Lai, H., & Tanygin, V. (2021). Advanced methods in automatic item generation. Routledge.
  • Gierl, M.J., Shin, J., Firoozi, T., & Lai, H. (2022). Using content coding and automatic item generation to improve test security. Frontiers in Education.
  • Gutl, C., Lankmayr, K., Weinhofer, J., & Hofler, M. (2011). Enhanced Automatic Question Creator--EAQC: Concept, Development and Evaluation of an Automatic Test Item Creation Tool to Foster Modern e-Education. Electronic Journal of e-Learning, 9(1), 23-38.
  • Haladyna, T.M., & Rodriguez, M.C. (2013). Developing and validating test items. Routledge.
  • Higgins, D. (2007). Item Distiller: Text retrieval for computer-assisted test item creation. Educational Testing Service Research Memorandum (RM-07-05). Princeton, NJ: Educational Testing Service.
  • Higgins, D., Futagi, Y., & Deane, P. (2005). Multilingual generalization of the ModelCreator software for math item generation. ETS Research Report Series, 2005(1), i-38.
  • Hommel, B.E., Wollang, F.-J.M., Kotova, V., Zacher, H., & Schmukle, S.C. (2022). Transformer-based deep neural language modeling for construct-specific automatic item generation. Psychometrika, 87(2), 749-772.
  • Irvine, S., & Kyllonen, P. (2002). Generating items for cognitive tests: Theory and practice. Mahwah, NJ: Erlbaum.
  • Kaptan, S. (1998). Bilimsel araştırma teknikleri ve istatiksel yöntemleri. Tekışık Ofset Tesisleri.
  • Karatay, H., & Dilekçi, A. (2019). Competencies of Turkish teachers in measuring and evaluating language skills. Milli Eğitim Dergisi, 48(1), 685-716.
  • Kınalıoğlu, İ.H., & Güven, Ş. (2011). Issues and solutions on measurement of student achievement in distance education. XIII. Akademik Bilişim Konferansı Bildiriler, 637-644.
  • Kosh, A.E., Simpson, M.A., Bickel, L., Kellogg, M., & Sanford‐Moore, E. (2019). A cost–benefit analysis of automatic item generation. Educational Measurement: Issues and Practice, 38(1), 48-53.
  • Lai, H., Gierl, M.J., Touchie, C., Pugh, D., Boulais, A.-P., & De Champlain, A. (2016). Using automatic item generation to improve the quality of MCQ distractors. Teaching and Learning in Medicine, 28(2), 166-173.
  • Lane, S., Raymond, M.R., Haladyna, T.M., & Downing, S.M. (2015). Test development process. In Handbook of test development (pp. 19-34). Routledge.
  • MEB. (2018). Ortaöğretim Türk dili ve edebiyatı dersi (9, 10, 11 ve 12. sınıflar) öğretim programı. http://mufredat.meb.gov.tr/ProgramDetay.aspx?PID=353
  • MEB. (2020). EBA Yeni Dönem. https://yegitek.meb.gov.tr/www/egitim-bilisim-aginin-eba-yeni-donem-lansmani-gerceklesti/icerik/2999
  • MEB. (2021). Çevrim içi sınav. https://www.meb.gov.tr/yuz-yuze-egitime-bir-haftalik-aradan-sonra-devam/haber/24621/tr
  • ÖSYM. (2018). AYT örnek soruları. https://www.osym.gov.tr/TR,13680/2018.html
  • ÖSYM. (2022). E-YDS. https://www.osym.gov.tr/TR,25238/2023.html
  • Özyalçın, K.E., & Kana, F. (2020). An evaluation on the skills of writing sub-text questions of teachers of Turkish as a foreign language. Çukurova University Journal of Turkology Research (ÇÜTAD), 5(2), 488-506.
  • Parshall, C.G., Spray, J.A., Kalohn, J., & Davey, T. (2002). Practical considerations in computer-based testing. Springer Science & Business Media.
  • Poinstingl, H. (2009). The Linear Logistic Test Model (LLTM) as the methodological foundation of item generating rules for a new verbal reasoning test. Psychological Test and Assessment Modeling, 51(2), 123.
  • Rodriguez, M.C. (2005). Three options are optimal for multiple‐choice items: A meta‐analysis of 80 years of research. Educational Measurement: Issues and Practice, 24(2), 3-13.
  • Ryoo, J.H., Park, S., Suh, H., Choi, J., & Kwon, J. (2022). Development of a New Measure of Cognitive Ability Using Automatic Item Generation and Its Psychometric Properties. SAGE Open, 12(2), 21582440221095016.
  • Sarı, T., & Nayır, F. (2020). Education in the pandemic period: Challenges and opportunities. Electronic Turkish Studies, 15(4). http://dx.doi.org/10.7827/TurkishStudies.44335
  • Saygı, H. (2021). Problems encountered by classroom teachers in the covid-19 pandemic distance education process. Açıköğretim Uygulamaları ve Araştırmaları Dergisi, 7(2), 109-129. https://doi.org/10.51948/auad.841632
  • Singley, M.K., & Bennett, R.E. (2002). Item generation and beyond: Applications of schema theory to mathematics assessment. In Generating items for cognitive tests: Theory and practice. Mahwah, NJ: Erlbaum.
  • Sinharay, S., & Johnson, M. (2005). Analysis of Data from an Admissions Test with Item Models. Research Report. ETS RR-05-06. ETS Research Report Series.
  • Sun, L., Liu, Y., & Luo, F. (2019). Automatic generation of number series reasoning items of high difficulty. Frontiers in Psychology, 10, 884.
  • TDV. (2022). Türk İslam Ansiklopedisi. https://islamansiklopedisi.org.tr/
  • TEDMEM. (2020). 2020 eğitim değerlendirme raporu (TEDMEM Değerlendirme Dizisi).
  • Weber, B., Schneider, B., Fritze, J., Gille, B., Hornung, S., Kühner, T., & Maurer, K. (2003). Acceptance of computerized compared to paper-and-pencil assessment in psychiatric inpatients. Computers in Human Behavior, 19(1), 81-93.
  • Yang, A.C., Chen, I.Y., Flanagan, B., & Ogata, H. (2021). Automatic generation of cloze items for repeated testing to improve reading comprehension. Educational Technology & Society, 24(3), 147-158.
  • Zhu, M., Liu, O.L., & Lee, H.-S. (2020). The effect of automated feedback on revision behavior and learning gains in formative assessment of scientific argument writing. Computers & Education, 143, 103668.

There are 58 citations in total.

Details

Primary Language English
Subjects Other Fields of Education
Journal Section Articles
Authors

Ayfer Sayın 0000-0003-1357-5674

Mark J. Gierl 0000-0002-2653-1761

Publication Date June 26, 2023
Submission Date February 9, 2023
Published in Issue Year 2023

Cite

APA Sayın, A., & Gierl, M.J. (2023). Automatic item generation for online measurement and evaluation: Turkish literature items. International Journal of Assessment Tools in Education, 10(2), 218-231. https://doi.org/10.21449/ijate.1249297
