Design of an iOS Mobile Application for the Automated Evaluation of Open-Ended Exams via Artificial Intelligence and Image Processing

Nazmi Ekin Vural

doi:10.53047/josse.1691312

Research Article

Design of an iOS Mobile Application for the Automated Evaluation of Open-Ended Exams via Artificial Intelligence and Image Processing

Year 2025, Volume: 8 Issue: 1, 1 - 35, 31.05.2025

Nazmi Ekin Vural

https://doi.org/10.53047/josse.1691312

Abstract

Evaluating open-ended exams presents significant challenges in terms of time management and consistency in educational processes. This study aims to develop an iOS-based mobile application, “Exam Reader” to streamline the evaluation of handwritten open-ended exam responses by integrating visual recognition and language analysis tools, enabling educators to deliver timely and fair assessments. Developed using the Swift programming language, the application relies on two core technologies. First, handwritten student responses are converted into digital text using Optical Character Recognition (OCR) via the Google Cloud Vision API. These texts are then analyzed for clarity and coherence using the OpenAI API and GPT-4o model, ensuring that students’ ideas are presented in a structured, accessible format for evaluation. Finally, the evaluation results and related data are provided to users in PDF format. Designed with a user-friendly interface, the application allows educators to quickly interpret responses and align them with expected learning outcomes through integrated language and image analysis tools. This system offers an innovative model for digitizing, standardizing, and automating open-ended exam evaluations, contributing to the systematic improvement of educational assessment processes. However, the application has limitations. Variations in handwriting and low-quality scans may reduce OCR accuracy, and AI-supported content analysis risks missing contextual nuances. Additionally, the system requires a stable internet connection, limiting offline functionality. Future enhancements, including advanced OCR models, multilingual support, and an offline mode, are planned to address these issues. The application developed in this direction is expected to make a significant contribution to the digitalization of educational assessment and to adapt to next-generation technologies.

Keywords

Image Processing, Handwriting Recognition, AI, Cloud Vision API, OpenAI API, iOS

References

Ahmed, F., Hina, R., & Asif, M. (2021). Evaluation of descriptive answers of open-ended questions using NLP techniques. Turkish Journal of Computer and Mathematics Education, 12(10), 4887–4896. https://doi.org/10.1109/ICCIS54243.2021.9676405
Ariely, N., Nazaretsky, T., & Alexandron, G. (2021). Machine learning and Hebrew NLP for automated assessment of open-ended questions in biology. https://doi.org/10.1007/s40593-021-00283-x
Ariely, M., Nazaretsky, T., & Alexandron, G. (2020). First steps towards NLP-based formative feedback to improve scientific writing in Hebrew. In A. N. Rafferty, J. Whitehill, V. Cavalli-Sforza, & C. Romero (Eds.), Proceedings of the 13th International Conference on Educational Data Mining (EDM 2020) (pp. 565–568). Retrieved from https://osf.io/preprints/edarxiv/pe5ky
Attali, Y., & Burstein, J. (2006). Automated Essay Scoring With e-rater® V.2. The Journal of Technology, Learning and Assessment, 4(3). Retrieved from https://ejournals.bc.edu/index.php/jtla/article/view/1650
Azizi, S., Mahmoudi, F., Alizadehsani, R., & Nahavandi, S. (2024). Image processing and artificial intelligence for apple detection and localization: A comprehensive review. Artificial Intelligence Review, 57(2), 123–150. https://doi.org/10.1016/j.cosrev.2024.100690
Burstein, J., Tetreault, J., & Madnani, N. (2013). The E-rater® automated essay scoring system. In M. D. Shermis & J. Burstein (Eds.), Handbook of automated essay evaluation: Current applications and new directions (pp. 55–67). Routledge. Retrieved from https://psycnet.apa.org/record/2013-15323-004
Chen, C. F. E., & Cheng, W. Y. E. (2008). Beyond the design of automated writing evaluation: Pedagogical practices and perceived learning effectiveness in EFL writing classes. Language Learning & Technology, 12(2), 94–112. Retrieved from https://www.lltjournal.org/item/10125-44145/
Cui, Y., & Liang, M. (2024). Automated scoring of translations with BERT models: Chinese and English language case study. Applied Sciences, 14(5), 1925. https://doi.org/10.3390/app14051925
Çeliker, N., & Gürsoy, S. (2025). Artificial intelligence in human resource management: A bibliometric analysis on trends, prospects and future research agenda. The Journal of Business Science, 13(1), 97-120. https://doi.org/10.22139/jobs.1594699
Çınar, A., Kara, A., Koç, H., & Kılıç, S. (2020). Machine learning algorithm for grading open-ended physics questions in Turkish. Education and Information Technologies, 25(5), 3821–3844. http://doi.org/10.1007/s10639-020-10128-0
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. https://arxiv.org/abs/1810.04805
Gonzalez, R. C., & Woods, R. E. (2008). Digital image processing (3rd ed.). Pearson Prentice Hall. Retrieved from https://sde.uoc.ac.in/sites/default/files/sde_videos/Digital%20Image%20Processing%203rd%20ed.%20-%20R.%20Gonzalez,%20R.%20Woods-ilovepdf-compressed.pdf
Graves, A., & Schmidhuber, J. (2009). Offline handwriting recognition with multidimensional recurrent neural networks. Advances in Neural Information Processing Systems, 21, 545–552. Retrieved from https://www.cs.toronto.edu/~graves/nips_2008.pdf
Hidayat, E. Y., Hastuti, K., & Muda, A. K. (2025). Artificial intelligence in digital image processing: A bibliometric analysis. Intelligent Systems with Applications, 25, 200466. https://doi.org/10.1016/j.iswa.2024.200466
Hussain, S., Dixit, P., & Hussain, M. S. (2020). Image processing in artificial intelligence. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 6(5), 244–249. https://doi.org/10.32628/CSEIT206542
Karadağ, N. (2023). The impact of artificial intelligence on online assessment: A preliminary review. Journal of Educational Technology & Online Learning, 6(4), 822-837. https://doi.org/10.31681/jetol.1351548
Ke, Z., & Ng, V. (2019). Automated essay scoring: A survey of the state of the art. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) (pp. 6300–6308). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/879
Keser Ateş, S., Kaleci, F., & Erdoğan, A. (2025). Artificial intelligence in education: A bibliometric analysis. Ahmet Keleşoğlu Faculty of Education Journal (AKEF), 7(1), 14–36. https://doi.org/10.38151/akef.2025.147
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
Lepakshi, V. A. (2022). Machine learning and deep learning based AI tools for development of diagnostic tools. In A. Parihar, R. Khan, A. Kumar, A. K. Kaushik, & H. Gohel (Eds.), Computational approaches for novel therapeutic and diagnostic designing to mitigate SARS-CoV-2 infection (pp. 399–420). Academic Press. https://doi.org/10.1016/B978-0-323-91172-6.00011-X
Norman, D. A. (2013). The design of everyday things (Rev. & expanded ed.). Basic Books. https://jnd.org/books/the-design-of-everyday-things-revised-and-expanded-edition/
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI. Retrieved from https://openai.com/research/language-unsupervised
Ramesh, D., & Sanampudi, S. K. (2022). An automated essay scoring systems: A systematic literature review. Artificial Intelligence Review, 55, 2495-2527. https://doi.org/10.1007/s10462-021-10068-2
Shermis, M. D. (2015). Contrasting state-of-the-art in the machine scoring of short-form constructed responses. Educational Assessment, 20(1), 46–65. https://doi.org/10.1080/10627197.2015.997617
Shermis, M. D., Burstein, J., & Apel Bursky, S. (2013). Introduction to automated essay evaluation (AES) NLP. In M. D. Shermis & J. Burstein (Eds.), Handbook of automated essay evaluation: Current applications and new directions (pp. 1–15). Routledge. https://doi.org/10.4324/9780203122761
Schultz, M. T. (2013). The IntelliMetric™ automated essay scoring engine – A review and an application to Chinese essay scoring. In M. D. Shermis & J. Burstein (Eds.), Handbook of automated essay evaluation: Current applications and new directions (pp. 89–98). Routledge. https://doi.org/10.4324/9780203122761
Shah, F. T., & Yousaf, K. (2007). Handwritten digit recognition using image processing and neural networks. Proceedings of the World Congress on Engineering 2007, Vol I (WCE 2007). Retrieved from https://www.iaeng.org/publication/WCE2007/WCE2007_pp648-651.pdf
Smith, R. (2007). An overview of the Tesseract OCR engine. Proceedings of the Ninth International Conference on Document Analysis and Recognition, 629–633. https://doi.org/10.1109/ICDAR.2007.4376991
Tanberkan, H., Özer, M., & Gelbal, S. (2024). Impact of artificial intelligence on assessment and evaluation approaches in education. International Journal of Educational Studies and Policy, 5(2), 139–152. Retrieved from https://dergipark.org.tr/en/pub/ijesp/issue/90802/1659826
Turhan, S., Bozkurt, M., & Şahin, D. Ö. (2023). Development of mobile application based optical mark reader system using image processing techniques. KMÜ Mühendislik ve Doğa Bilimleri Dergisi, 5(2), 169–190. https://doi.org/10.55213/kmujens.1386520
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser L. & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30, 5998–6008. https://arxiv.org/abs/1706.03762
Wang, S., Wang, F., Zhu, Z., Wang, J., Tran, T., & Du, Z. (2024). Artificial intelligence in education: A systematic literature review. Expert Systems with Applications, 252, 124167. https://doi.org/10.1016/j.eswa.2024.124167
Warschauer, M., & Grimes, D. (2008). Automated writing assessment in the classroom. Pedagogies: An International Journal, 3(1), 22–36. https://doi.org/10.1080/15544800701771580
Wiser, M. J., Anderson, C. W., & Sadler, T. D. (2016). Comparing human and automated evaluation of open-ended student responses to questions of evolution. Research in Science Education, 46(6), 841–867. https://doi.org/10.1162/978-0-262-33936-0-ch025
Zhang, M., Press, O., Merrill, W., Liu, A., & Smith, N. A. (2023). How language model hallucinations can snowball. https://arxiv.org/abs/2305.13534

Açık Uçlu Sınavların Yapay Zeka ve Görüntü İşleme Aracılığıyla Otomatik Değerlendirilmesi için iOS Mobil Uygulama Tasarımı

Year 2025, Volume: 8 Issue: 1, 1 - 35, 31.05.2025

Nazmi Ekin Vural

https://doi.org/10.53047/josse.1691312

Abstract

Açık uçlu sınavların değerlendirilmesi, eğitim süreçlerinde zaman yönetimi ve değerlendirme tutarlılığı açısından önemli zorluklar yaratır. Bu çalışma, el yazısı ile yazılan açık uçlu sınav yanıtlarının değerlendirilmesini kolaylaştırmak için iOS tabanlı bir mobil uygulama geliştirmeyi amaçlar. “Sınav Okuma” adlı uygulama, görsel tanıma araçlarını dil analiziyle birleştirerek eğitimcilerin zamanında ve daha adil değerlendirmeler yapmasını amaçlar. Swift programlama diliyle geliştirilen uygulama, iki ana teknolojik bileşene dayanır. İlk olarak, öğrencilerin el yazısı yanıtları, Google Cloud Vision API aracılığıyla Optik Karakter Tanıma (OCR) teknolojisi kullanılarak dijital metne dönüştürülür. Bu metinler, OpenAI API ve GPT-4o modeli ile netlik ve tutarlılık açısından incelenir, böylece öğrencilerin fikirleri daha yapılandırılmış ve değerlendirme için erişilebilir hale gelir. Son olarak, değerlendirme sonuçları ve ilgili veriler, kullanıcıya PDF formatında sunulur.

Kullanıcı dostu bir arayüze sahip olan uygulama, eğitimcilerin öğrenci yanıtlarını hızlıca yorumlamasını ve entegre dil ve görüntü analiz araçlarıyla beklenen öğrenme çıktılarına eşleştirmesini sağlar. Bu sistem, açık uçlu sınav değerlendirmesinin dijitalleştirilmesi, standartlaştırılması ve otomasyonu için yenilikçi bir model sunarak eğitimde ölçme ve değerlendirme süreçlerinin sistematik olarak iyileştirilmesine katkıda bulunur. Ancak, uygulamanın bazı sınırlamaları vardır. El yazısı tanıma farklılıkları ve düşük kaliteli taramalar OCR doğruluğunu etkileyebilir; yapay zeka destekli içerik analizinde bağlamın tam anlaşılmaması riski bulunabilir. Ayrıca, sistem tam performans için kesintisiz internet bağlantısı gerektirir, bu da çevrimdışı kullanım senaryolarını sınırlar. Bu sınırlamalar göz önüne alındığında, daha gelişmiş OCR modellerinin entegrasyonu, çok dilli destek ve çevrimdışı mod geliştirilmesi gibi gelecekteki iyileştirmeler hedeflenir. Bu doğrultuda geliştirilen uygulamanın, eğitimde ölçme ve değerlendirme alanında dijitalleşmeyi destekleyen ve yeni nesil teknolojilere uyum sağlayan önemli bir katkı sunacağı düşünülmektedir.

Keywords

Görüntü İşleme, El Yazısı Tanıma, Yapay Zekâ, Cloud Vision API, OpenAI API, iOS

References

Ahmed, F., Hina, R., & Asif, M. (2021). Evaluation of descriptive answers of open-ended questions using NLP techniques. Turkish Journal of Computer and Mathematics Education, 12(10), 4887–4896. https://doi.org/10.1109/ICCIS54243.2021.9676405
Ariely, N., Nazaretsky, T., & Alexandron, G. (2021). Machine learning and Hebrew NLP for automated assessment of open-ended questions in biology. https://doi.org/10.1007/s40593-021-00283-x
Ariely, M., Nazaretsky, T., & Alexandron, G. (2020). First steps towards NLP-based formative feedback to improve scientific writing in Hebrew. In A. N. Rafferty, J. Whitehill, V. Cavalli-Sforza, & C. Romero (Eds.), Proceedings of the 13th International Conference on Educational Data Mining (EDM 2020) (pp. 565–568). Retrieved from https://osf.io/preprints/edarxiv/pe5ky
Attali, Y., & Burstein, J. (2006). Automated Essay Scoring With e-rater® V.2. The Journal of Technology, Learning and Assessment, 4(3). Retrieved from https://ejournals.bc.edu/index.php/jtla/article/view/1650
Azizi, S., Mahmoudi, F., Alizadehsani, R., & Nahavandi, S. (2024). Image processing and artificial intelligence for apple detection and localization: A comprehensive review. Artificial Intelligence Review, 57(2), 123–150. https://doi.org/10.1016/j.cosrev.2024.100690
Burstein, J., Tetreault, J., & Madnani, N. (2013). The E-rater® automated essay scoring system. In M. D. Shermis & J. Burstein (Eds.), Handbook of automated essay evaluation: Current applications and new directions (pp. 55–67). Routledge. Retrieved from https://psycnet.apa.org/record/2013-15323-004
Chen, C. F. E., & Cheng, W. Y. E. (2008). Beyond the design of automated writing evaluation: Pedagogical practices and perceived learning effectiveness in EFL writing classes. Language Learning & Technology, 12(2), 94–112. Retrieved from https://www.lltjournal.org/item/10125-44145/
Cui, Y., & Liang, M. (2024). Automated scoring of translations with BERT models: Chinese and English language case study. Applied Sciences, 14(5), 1925. https://doi.org/10.3390/app14051925
Çeliker, N., & Gürsoy, S. (2025). Artificial intelligence in human resource management: A bibliometric analysis on trends, prospects and future research agenda. The Journal of Business Science, 13(1), 97-120. https://doi.org/10.22139/jobs.1594699
Çınar, A., Kara, A., Koç, H., & Kılıç, S. (2020). Machine learning algorithm for grading open-ended physics questions in Turkish. Education and Information Technologies, 25(5), 3821–3844. http://doi.org/10.1007/s10639-020-10128-0
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. https://arxiv.org/abs/1810.04805
Gonzalez, R. C., & Woods, R. E. (2008). Digital image processing (3rd ed.). Pearson Prentice Hall. Retrieved from https://sde.uoc.ac.in/sites/default/files/sde_videos/Digital%20Image%20Processing%203rd%20ed.%20-%20R.%20Gonzalez,%20R.%20Woods-ilovepdf-compressed.pdf
Graves, A., & Schmidhuber, J. (2009). Offline handwriting recognition with multidimensional recurrent neural networks. Advances in Neural Information Processing Systems, 21, 545–552. Retrieved from https://www.cs.toronto.edu/~graves/nips_2008.pdf
Hidayat, E. Y., Hastuti, K., & Muda, A. K. (2025). Artificial intelligence in digital image processing: A bibliometric analysis. Intelligent Systems with Applications, 25, 200466. https://doi.org/10.1016/j.iswa.2024.200466
Hussain, S., Dixit, P., & Hussain, M. S. (2020). Image processing in artificial intelligence. International Journal of Scientific Research in Computer Science, Engineering and Information Technology, 6(5), 244–249. https://doi.org/10.32628/CSEIT206542
Karadağ, N. (2023). The impact of artificial intelligence on online assessment: A preliminary review. Journal of Educational Technology & Online Learning, 6(4), 822-837. https://doi.org/10.31681/jetol.1351548
Ke, Z., & Ng, V. (2019). Automated essay scoring: A survey of the state of the art. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19) (pp. 6300–6308). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2019/879
Keser Ateş, S., Kaleci, F., & Erdoğan, A. (2025). Artificial intelligence in education: A bibliometric analysis. Ahmet Keleşoğlu Faculty of Education Journal (AKEF), 7(1), 14–36. https://doi.org/10.38151/akef.2025.147
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444. https://doi.org/10.1038/nature14539
Lepakshi, V. A. (2022). Machine learning and deep learning based AI tools for development of diagnostic tools. In A. Parihar, R. Khan, A. Kumar, A. K. Kaushik, & H. Gohel (Eds.), Computational approaches for novel therapeutic and diagnostic designing to mitigate SARS-CoV-2 infection (pp. 399–420). Academic Press. https://doi.org/10.1016/B978-0-323-91172-6.00011-X
Norman, D. A. (2013). The design of everyday things (Rev. & expanded ed.). Basic Books. https://jnd.org/books/the-design-of-everyday-things-revised-and-expanded-edition/
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI. Retrieved from https://openai.com/research/language-unsupervised
Ramesh, D., & Sanampudi, S. K. (2022). An automated essay scoring systems: A systematic literature review. Artificial Intelligence Review, 55, 2495-2527. https://doi.org/10.1007/s10462-021-10068-2
Shermis, M. D. (2015). Contrasting state-of-the-art in the machine scoring of short-form constructed responses. Educational Assessment, 20(1), 46–65. https://doi.org/10.1080/10627197.2015.997617
Shermis, M. D., Burstein, J., & Apel Bursky, S. (2013). Introduction to automated essay evaluation (AES) NLP. In M. D. Shermis & J. Burstein (Eds.), Handbook of automated essay evaluation: Current applications and new directions (pp. 1–15). Routledge. https://doi.org/10.4324/9780203122761
Schultz, M. T. (2013). The IntelliMetric™ automated essay scoring engine – A review and an application to Chinese essay scoring. In M. D. Shermis & J. Burstein (Eds.), Handbook of automated essay evaluation: Current applications and new directions (pp. 89–98). Routledge. https://doi.org/10.4324/9780203122761
Shah, F. T., & Yousaf, K. (2007). Handwritten digit recognition using image processing and neural networks. Proceedings of the World Congress on Engineering 2007, Vol I (WCE 2007). Retrieved from https://www.iaeng.org/publication/WCE2007/WCE2007_pp648-651.pdf
Smith, R. (2007). An overview of the Tesseract OCR engine. Proceedings of the Ninth International Conference on Document Analysis and Recognition, 629–633. https://doi.org/10.1109/ICDAR.2007.4376991
Tanberkan, H., Özer, M., & Gelbal, S. (2024). Impact of artificial intelligence on assessment and evaluation approaches in education. International Journal of Educational Studies and Policy, 5(2), 139–152. Retrieved from https://dergipark.org.tr/en/pub/ijesp/issue/90802/1659826
Turhan, S., Bozkurt, M., & Şahin, D. Ö. (2023). Development of mobile application based optical mark reader system using image processing techniques. KMÜ Mühendislik ve Doğa Bilimleri Dergisi, 5(2), 169–190. https://doi.org/10.55213/kmujens.1386520
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser L. & Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30, 5998–6008. https://arxiv.org/abs/1706.03762
Wang, S., Wang, F., Zhu, Z., Wang, J., Tran, T., & Du, Z. (2024). Artificial intelligence in education: A systematic literature review. Expert Systems with Applications, 252, 124167. https://doi.org/10.1016/j.eswa.2024.124167
Warschauer, M., & Grimes, D. (2008). Automated writing assessment in the classroom. Pedagogies: An International Journal, 3(1), 22–36. https://doi.org/10.1080/15544800701771580
Wiser, M. J., Anderson, C. W., & Sadler, T. D. (2016). Comparing human and automated evaluation of open-ended student responses to questions of evolution. Research in Science Education, 46(6), 841–867. https://doi.org/10.1162/978-0-262-33936-0-ch025
Zhang, M., Press, O., Merrill, W., Liu, A., & Smith, N. A. (2023). How language model hallucinations can snowball. https://arxiv.org/abs/2305.13534

There are 35 citations in total.

Details

Primary Language	English
Subjects	Educational Technology and Computing
Journal Section	Articles
Authors	Nazmi Ekin Vural 0000-0003-4198-0407
Publication Date	May 31, 2025
Submission Date	May 4, 2025
Acceptance Date	May 31, 2025
Published in Issue	Year 2025 Volume: 8 Issue: 1

Cite

APA	Vural, N. E. (2025). Design of an iOS Mobile Application for the Automated Evaluation of Open-Ended Exams via Artificial Intelligence and Image Processing. Journal of Social Sciences And Education, 8(1), 1-35. https://doi.org/10.53047/josse.1691312

Download Cover Image

Article Files

Full Text