This study explores the efficacy of ChatGPT-3.5, an AI chatbot, as an Automatic Essay Scoring (AES) system and feedback provider for IELTS essay preparation. It investigates how closely the scores given by ChatGPT-3.5 align with those assigned by official IELTS examiners in order to establish its reliability as an AES system. It also identifies the strategies ChatGPT-3.5 employs when revising essays against the four IELTS rubrics: task achievement, coherence and cohesion, lexical resource, and grammatical range and accuracy. Pre-rated essays from an official IELTS preparatory book served as a control measure to ensure objectivity. The findings indicate a discrepancy, with ChatGPT-3.5 typically assigning lower scores than the official raters. However, ChatGPT-3.5 shows a robust capability to revise essays across all four descriptors. In addition, its effectiveness as a feedback provider may be attributable to the essay type and its widely accepted rubrics. The study contributes to the understanding of how AI tools can be applied in second language writing and suggests that future studies focus on evaluating the capacity and effectiveness of such tools in pedagogical applications.
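The comparison between chatbot and examiner scores can be illustrated with a minimal sketch. The abstract does not state which statistics the study used, so the measures below (mean difference, exact agreement, and agreement within half a band) are assumptions chosen for illustration, and the function name `compare_scores` and all scores are hypothetical rather than data from the study.

```python
from statistics import mean


def compare_scores(ai_scores, examiner_scores, tolerance=0.5):
    """Summarize agreement between two equally long lists of band scores.

    Hypothetical helper for illustration; not the study's actual procedure.
    """
    if len(ai_scores) != len(examiner_scores) or not ai_scores:
        raise ValueError("score lists must be non-empty and of equal length")

    # Signed difference per essay: negative means the AI scored lower.
    diffs = [a - e for a, e in zip(ai_scores, examiner_scores)]

    return {
        "mean_difference": mean(diffs),
        # Share of essays where both raters gave the same band.
        "exact_agreement": sum(d == 0 for d in diffs) / len(diffs),
        # Share of essays where the bands differ by at most `tolerance`.
        "adjacent_agreement": sum(abs(d) <= tolerance for d in diffs) / len(diffs),
    }


# Invented band scores, used only to show how the function behaves.
examiner = [6.0, 7.0, 5.5, 6.5]
chatgpt = [5.5, 6.0, 5.5, 6.0]

summary = compare_scores(chatgpt, examiner)
print(summary)
```

A negative `mean_difference` would correspond to the pattern reported above, in which the chatbot tends to score lower than the official raters.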
This research was partly funded by the BNU-HKBU United International College Startup Grant UICR0700033-22.
ChatGPT-3.5 as an automatic scoring system and feedback provider in IELTS exams
Year 2025, Volume: 12 Issue: 1, 62-77, 20.02.2025
Chen, X., Zhou, Z., & Prado, M. (2025). ChatGPT-3.5 as an automatic scoring system and feedback provider in IELTS exams. International Journal of Assessment Tools in Education, 12(1), 62-77. https://doi.org/10.21449/ijate.1496193