Research Article

Design of an iOS Mobile Application for the Automated Evaluation of Open-Ended Exams via Artificial Intelligence and Image Processing

Year 2025, Volume: 8 Issue: 1, 1 - 35, 31.05.2025
https://doi.org/10.53047/josse.1691312

Abstract

Evaluating open-ended exams is demanding in terms of both time and consistency. This study develops an iOS mobile application, “Exam Reader”, that streamlines the evaluation of handwritten open-ended exam responses by combining visual recognition with language analysis, so that educators can deliver timely and fair assessments. The application is written in Swift and relies on two core technologies. First, handwritten student responses are converted into digital text by Optical Character Recognition (OCR) through the Google Cloud Vision API. Second, the resulting text is analyzed for clarity and coherence with the GPT-4o model through the OpenAI API, so that students’ ideas are presented in a structured, accessible form for evaluation. The evaluation results and related data are then delivered to users as a PDF. The user-friendly interface lets educators interpret responses quickly and align them with expected learning outcomes using the integrated language and image analysis tools. The system offers a model for digitizing, standardizing, and automating open-ended exam evaluation, and so contributes to the systematic improvement of educational assessment processes. The application has limitations, however. Variations in handwriting and low-quality scans may reduce OCR accuracy, and AI-supported content analysis may miss contextual nuances. The system also requires a stable internet connection, which limits offline use. Planned enhancements, including more advanced OCR models, multilingual support, and an offline mode, are intended to address these issues. The application is expected to contribute to the digitalization of educational assessment and to remain adaptable to next-generation technologies.
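The two-stage pipeline described in the abstract can be sketched in Swift as follows. This is only an illustrative outline, not the application's actual code, which is not reproduced here. The type and function names (`VisionRequest`, `makeOCRRequest`, `extractText`, `makeEvaluationRequest`) and the system prompt are hypothetical. The JSON field names follow the public REST formats of the Google Cloud Vision `images:annotate` endpoint and the OpenAI Chat Completions endpoint as commonly documented, and the use of the `DOCUMENT_TEXT_DETECTION` feature is an assumption. Networking, authentication, and the PDF export step are omitted.

```swift
import Foundation

// Request body for Google Cloud Vision `images:annotate` (assumed shape).
struct VisionRequest: Codable {
    struct Image: Codable { let content: String }   // base64-encoded image bytes
    struct Feature: Codable { let type: String }
    struct Item: Codable { let image: Image; let features: [Feature] }
    let requests: [Item]
}

// Only the part of the Vision response that this sketch reads.
struct VisionResponse: Codable {
    struct Annotation: Codable { let text: String }
    struct Item: Codable { let fullTextAnnotation: Annotation? }
    let responses: [Item]
}

// Request body for the OpenAI Chat Completions endpoint (assumed shape).
struct ChatRequest: Codable {
    struct Message: Codable { let role: String; let content: String }
    let model: String
    let messages: [Message]
}

/// Step 1: wrap a scanned answer sheet in an OCR request.
func makeOCRRequest(imageData: Data) -> VisionRequest {
    VisionRequest(requests: [
        .init(image: .init(content: imageData.base64EncodedString()),
              features: [.init(type: "DOCUMENT_TEXT_DETECTION")])
    ])
}

/// Step 2: pull the recognized text out of the OCR response.
/// Returns an empty string when no text annotation is present.
func extractText(from responseJSON: Data) throws -> String {
    let decoded = try JSONDecoder().decode(VisionResponse.self, from: responseJSON)
    return decoded.responses.first?.fullTextAnnotation?.text ?? ""
}

/// Step 3: build the language-model request for the recognized answer.
/// The prompt wording is a placeholder, not the one used in the study.
func makeEvaluationRequest(question: String, answerText: String) -> ChatRequest {
    ChatRequest(model: "gpt-4o", messages: [
        .init(role: "system",
              content: "Assess the student's answer for clarity and coherence."),
        .init(role: "user",
              content: "Question: \(question)\nStudent answer: \(answerText)")
    ])
}
```

Each request value can be serialized with `JSONEncoder` and sent with `URLSession`. Keeping payload construction separate from networking makes the OCR and evaluation steps testable without an internet connection, which is relevant given the connectivity limitation noted above.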

References

  • Ahmed, F., Hina, R., & Asif, M. (2021). Evaluation of descriptive answers of open-ended questions using NLP techniques. Turkish Journal of Computer and Mathematics Education, 12(10), 4887–4896. https://doi.org/10.1109/ICCIS54243.2021.9676405
  • Ariely, M., Nazaretsky, T., & Alexandron, G. (2021). Machine learning and Hebrew NLP for automated assessment of open-ended questions in biology. https://doi.org/10.1007/s40593-021-00283-x
  • Ariely, M., Nazaretsky, T., & Alexandron, G. (2020). First steps towards NLP-based formative feedback to improve scientific writing in Hebrew. In A. N. Rafferty, J. Whitehill, V. Cavalli-Sforza, & C. Romero (Eds.), Proceedings of the 13th International Conference on Educational Data Mining (EDM 2020) (pp. 565–568). Retrieved from https://osf.io/preprints/edarxiv/pe5ky
  • Attali, Y., & Burstein, J. (2006). Automated Essay Scoring With e-rater® V.2. The Journal of Technology, Learning and Assessment, 4(3). Retrieved from https://ejournals.bc.edu/index.php/jtla/article/view/1650
  • Azizi, S., Mahmoudi, F., Alizadehsani, R., & Nahavandi, S. (2024). Image processing and artificial intelligence for apple detection and localization: A comprehensive review. Artificial Intelligence Review, 57(2), 123–150. https://doi.org/10.1016/j.cosrev.2024.100690
  • Burstein, J., Tetreault, J., & Madnani, N. (2013). The E-rater® automated essay scoring system. In M. D. Shermis & J. Burstein (Eds.), Handbook of automated essay evaluation: Current applications and new directions (pp. 55–67). Routledge. Retrieved from https://psycnet.apa.org/record/2013-15323-004
  • Chen, C. F. E., & Cheng, W. Y. E. (2008). Beyond the design of automated writing evaluation: Pedagogical practices and perceived learning effectiveness in EFL writing classes. Language Learning & Technology, 12(2), 94–112. Retrieved from https://www.lltjournal.org/item/10125-44145/
  • Cui, Y., & Liang, M. (2024). Automated scoring of translations with BERT models: Chinese and English language case study. Applied Sciences, 14(5), 1925. https://doi.org/10.3390/app14051925
  • Çeliker, N., & Gürsoy, S. (2025). Artificial intelligence in human resource management: A bibliometric analysis on trends, prospects and future research agenda. The Journal of Business Science, 13(1), 97–120. https://doi.org/10.22139/jobs.1594699
  • Çınar, A., Kara, A., Koç, H., & Kılıç, S. (2020). Machine learning algorithm for grading open-ended physics questions in Turkish. Education and Information Technologies, 25(5), 3821–3844. https://doi.org/10.1007/s10639-020-10128-0
  • Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. https://arxiv.org/abs/1810.04805
  • Gonzalez, R. C., & Woods, R. E. (2008). Digital image processing (3rd ed.). Pearson Prentice Hall. Retrieved from https://sde.uoc.ac.in/sites/default/files/sde_videos/Digital%20Image%20Processing%203rd%20ed.%20-%20R.%20Gonzalez,%20R.%20Woods-ilovepdf-compressed.pdf
  • Graves, A., & Schmidhuber, J. (2009). Offline handwriting recognition with multidimensional recurrent neural networks. Advances in Neural Information Processing Systems, 21, 545–552. Retrieved from https://www.cs.toronto.edu/~graves/nips_2008.pdf
  • Hidayat, E. Y., Hastuti, K., & Muda, A. K. (2025). Artificial intelligence in digital image processing: A bibliometric analysis. Intelligent Systems with Applications, 25, 200466. https://doi.org/10.1016/j.iswa.2024.200466
  • Hussain, S., Dixit, P., & Hussain, M. S. (2020). Image processing in artificial intelligence. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 6(5), 244–249. https://doi.org/10.32628/CSEIT206542
  • Karadağ, N. (2023). The impact of artificial intelligence on online assessment: A preliminary review. Journal of Educational Technology & Online Learning, 6(4), 822–837. https://doi.org/10.31681/jetol.1351548
  • Ke, Z., & Ng, V. (2019). Automated essay scoring: A survey of the state of the art. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) (pp. 6300–6308). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/879
  • Keser Ateş, S., Kaleci, F., & Erdoğan, A. (2025). Artificial intelligence in education: A bibliometric analysis. Ahmet Keleşoğlu Faculty of Education Journal (AKEF), 7(1), 14–36. https://doi.org/10.38151/akef.2025.147
  • LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
  • Lepakshi, V. A. (2022). Machine learning and deep learning based AI tools for development of diagnostic tools. In A. Parihar, R. Khan, A. Kumar, A. K. Kaushik, & H. Gohel (Eds.), Computational approaches for novel therapeutic and diagnostic designing to mitigate SARS-CoV-2 infection (pp. 399–420). Academic Press. https://doi.org/10.1016/B978-0-323-91172-6.00011-X
  • Norman, D. A. (2013). The design of everyday things (Rev. & expanded ed.). Basic Books. https://jnd.org/books/the-design-of-everyday-things-revised-and-expanded-edition/
  • Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI. Retrieved from https://openai.com/research/language-unsupervised
  • Ramesh, D., & Sanampudi, S. K. (2022). An automated essay scoring systems: A systematic literature review. Artificial Intelligence Review, 55, 2495–2527. https://doi.org/10.1007/s10462-021-10068-2
  • Shermis, M. D. (2015). Contrasting state-of-the-art in the machine scoring of short-form constructed responses. Educational Assessment, 20(1), 46–65. https://doi.org/10.1080/10627197.2015.997617
  • Shermis, M. D., Burstein, J., & Apel Bursky, S. (2013). Introduction to automated essay evaluation (AES) NLP. In M. D. Shermis & J. Burstein (Eds.), Handbook of automated essay evaluation: Current applications and new directions (pp. 1–15). Routledge. https://doi.org/10.4324/9780203122761
  • Schultz, M. T. (2013). The IntelliMetric™ automated essay scoring engine – A review and an application to Chinese essay scoring. In M. D. Shermis & J. Burstein (Eds.), Handbook of automated essay evaluation: Current applications and new directions (pp. 89–98). Routledge. https://doi.org/10.4324/9780203122761
  • Shah, F. T., & Yousaf, K. (2007). Handwritten digit recognition using image processing and neural networks. Proceedings of the World Congress on Engineering 2007, Vol I (WCE 2007). Retrieved from https://www.iaeng.org/publication/WCE2007/WCE2007_pp648-651.pdf
  • Smith, R. (2007). An overview of the Tesseract OCR engine. Proceedings of the Ninth International Conference on Document Analysis and Recognition, 629–633. https://doi.org/10.1109/ICDAR.2007.4376991
  • Tanberkan, H., Özer, M., & Gelbal, S. (2024). Impact of artificial intelligence on assessment and evaluation approaches in education. International Journal of Educational Studies and Policy, 5(2), 139–152. Retrieved from https://dergipark.org.tr/en/pub/ijesp/issue/90802/1659826
  • Turhan, S., Bozkurt, M., & Şahin, D. Ö. (2023). Development of mobile application based optical mark reader system using image processing techniques. KMÜ Mühendislik ve Doğa Bilimleri Dergisi, 5(2), 169–190. https://doi.org/10.55213/kmujens.1386520
  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30, 5998–6008. https://arxiv.org/abs/1706.03762
  • Wang, S., Wang, F., Zhu, Z., Wang, J., Tran, T., & Du, Z. (2024). Artificial intelligence in education: A systematic literature review. Expert Systems with Applications, 252, 124167. https://doi.org/10.1016/j.eswa.2024.124167
  • Warschauer, M., & Grimes, D. (2008). Automated writing assessment in the classroom. Pedagogies: An International Journal, 3(1), 22–36. https://doi.org/10.1080/15544800701771580
  • Wiser, M. J., Anderson, C. W., & Sadler, T. D. (2016). Comparing human and automated evaluation of open-ended student responses to questions of evolution. Research in Science Education, 46(6), 841–867. https://doi.org/10.1162/978-0-262-33936-0-ch025
  • Zhang, M., Press, O., Merrill, W., Liu, A., & Smith, N. A. (2023). How language model hallucinations can snowball. https://arxiv.org/abs/2305.13534

Açık Uçlu Sınavların Yapay Zeka ve Görüntü İşleme Aracılığıyla Otomatik Değerlendirilmesi için iOS Mobil Uygulama Tasarımı

Abstract

Evaluating open-ended exams poses significant challenges for time management and grading consistency in educational processes. This study aims to develop an iOS-based mobile application that makes it easier to evaluate handwritten open-ended exam responses. The application, named “Sınav Okuma” (“Exam Reader”), combines visual recognition tools with language analysis so that educators can produce timely and fairer assessments. Developed in the Swift programming language, the application rests on two main technological components. First, students’ handwritten responses are converted into digital text using Optical Character Recognition (OCR) through the Google Cloud Vision API. These texts are then examined for clarity and coherence with the OpenAI API and the GPT-4o model, which makes students’ ideas more structured and accessible for evaluation. Finally, the evaluation results and related data are presented to the user in PDF format.

With its user-friendly interface, the application lets educators interpret student responses quickly and match them to expected learning outcomes using the integrated language and image analysis tools. The system offers an innovative model for digitizing, standardizing, and automating open-ended exam evaluation, and thereby contributes to the systematic improvement of measurement and assessment processes in education. The application nevertheless has some limitations. Differences in handwriting and low-quality scans can affect OCR accuracy, and AI-supported content analysis carries the risk of not fully understanding context. In addition, the system requires an uninterrupted internet connection for full performance, which limits offline use. In view of these limitations, future improvements such as the integration of more advanced OCR models, multilingual support, and an offline mode are targeted. The application is expected to make an important contribution that supports digitalization in educational measurement and assessment and adapts to next-generation technologies.

There are 35 citations in total.

Details

Primary Language English
Subjects Educational Technology and Computing
Journal Section Articles
Authors

Nazmi Ekin Vural 0000-0003-4198-0407

Publication Date May 31, 2025
Submission Date May 4, 2025
Acceptance Date May 31, 2025
Published in Issue Year 2025 Volume: 8 Issue: 1

Cite

APA Vural, N. E. (2025). Design of an iOS Mobile Application for the Automated Evaluation of Open-Ended Exams via Artificial Intelligence and Image Processing. Journal of Social Sciences And Education, 8(1), 1-35. https://doi.org/10.53047/josse.1691312