GDD Generation for Hyper-Casual Games Using Large Language Models: A Comparative Evaluation

Muhammet Emin Aydinalp; Buket Doğan; Abdullah Bal

doi:10.17798/bitlisfen.1664312

Research Article


Year 2025, Volume: 14 Issue: 3, 1469 - 1486, 30.09.2025



Abstract

A Game Design Document (GDD) records the design and mechanical details of a game under development. By describing the game's progression, story, and design features, it gives team members a shared understanding of the project. For development to proceed smoothly, a GDD must be clear, detailed, and of high quality, yet writing one is time-consuming and error-prone. In genres that depend on rapid prototyping in particular, an incomplete or inadequate GDD can delay the project. This study examines the effectiveness of large language models (LLMs) in GDD production. The hyper-casual game Pool Wars was selected as a reference. For this game, a GDD written by a human expert and a GDD produced by ChatGPT-4 using various prompting methods were evaluated by four domain experts (one academic and three industry professionals) against eight criteria on a five-point Likert scale. In addition to structural and creative aspects, visual elements were included in the evaluation; ImageFX, developed by Google, was used to add visual content to the GDD created by ChatGPT-4. The LLM-generated GDD outperformed the human-written one on many criteria: it received an overall average score of 4.71 out of 5, compared with 3.29 for the GDD prepared by the human expert. Its advantage was clearest in understandability and level of detail. In creativity and visual content, however, it performed similarly to the human expert, indicating room for improvement in these areas.
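The scoring scheme described in the abstract (four experts, eight criteria, a five-point Likert scale, one overall average per GDD) can be illustrated with a minimal sketch. The function name `summarize` and the ratings below are hypothetical placeholders chosen for illustration; they are not the study's data, and the authors' actual aggregation procedure is not given in the abstract, so a plain arithmetic mean is assumed here.

```python
from statistics import mean

N_CRITERIA = 8  # the abstract reports eight evaluation criteria


def summarize(ratings):
    """Average Likert scores per criterion and overall.

    ratings: one list per expert, each holding N_CRITERIA scores in 1..5.
    Assumes a plain arithmetic mean; the paper's exact method may differ.
    """
    for row in ratings:
        if len(row) != N_CRITERIA or any(not 1 <= s <= 5 for s in row):
            raise ValueError("each expert must give 8 scores in the range 1-5")
    # zip(*ratings) regroups the scores by criterion instead of by expert.
    per_criterion = [mean(col) for col in zip(*ratings)]
    overall = mean(per_criterion)
    return per_criterion, overall


# Placeholder ratings for four experts, NOT data from the study.
placeholder = [
    [5, 4, 5, 4, 5, 4, 5, 4],
    [4, 4, 4, 4, 4, 4, 4, 4],
    [5, 5, 5, 5, 5, 5, 5, 5],
    [4, 5, 4, 5, 4, 5, 4, 5],
]
per_criterion, overall = summarize(placeholder)
print(per_criterion, round(overall, 2))
```

Because every expert rates the same number of criteria, averaging the per-criterion means gives the same result as averaging all 32 individual scores.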

Keywords

Large language models (LLM), Game design documentation (GDD), ChatGPT-4, Hyper-casual games, Prompt engineering, ImageFX

Ethical Statement

The study complies with research and publication ethics.


Details

Primary Language	English
Subjects	Artificial Intelligence (Other)
Journal Section	Research Article
Authors	Muhammet Emin Aydinalp (ORCID: 0009-0004-9423-0217); Buket Doğan (ORCID: 0000-0003-1062-2439); Abdullah Bal (ORCID: 0000-0002-4525-8254)
Publication Date	September 30, 2025
Submission Date	March 25, 2025
Acceptance Date	July 8, 2025
Published in Issue	Year 2025 Volume: 14 Issue: 3

Cite

IEEE	M. E. Aydinalp, B. Doğan, and A. Bal, “GDD Generation for Hyper-Casual Games Using Large Language Models: A Comparative Evaluation”, Bitlis Eren Üniversitesi Fen Bilimleri Dergisi, vol. 14, no. 3, pp. 1469–1486, 2025, doi: 10.17798/bitlisfen.1664312.

