Research Article

Shaping Mathematics Activities with Generative AI: Prompt Types, Models and Pedagogical Outcomes

Volume: 19 Number: 2 April 24, 2026

Abstract

This study investigates the relationship between prompt types and the quality of mathematics activities generated by artificial intelligence (AI) tools. Within a multiple-case study design, two advanced AI systems, ChatGPT-5 (OpenAI, September 2025) and Gemini 2.5 Pro (Google DeepMind, September 2025), were examined using command (C) and request (R) prompts under standardised settings (temperature = 0.7, top-p = 0.9). Four activities were produced and evaluated with the Activity Evaluation and Feedback Tool, which assesses both component-level features (intended outcome, materials, instructions, responsibility, inclusivity, depth, complexity, and mathematical focus) and overall quality. The analysis revealed that three of the four AI-generated activities reached the high-quality range, with total scores of 22, 19, and 23 out of 24 points for Gemini-R, Gemini-C, and ChatGPT-C, respectively, whereas ChatGPT-R scored 15 points, indicating a medium level close to the high-quality threshold. ChatGPT demonstrated greater effectiveness with command prompts, whereas Gemini produced consistently high-quality outputs, performing better with request prompts. At the component level, intended outcome and materials were consistently strong, while weaknesses were observed in instructions, responsibility, and complexity, depending on the AI–prompt combination. These findings demonstrate that activity quality is shaped not only by prompt design but also by model-specific affordances. Implications are discussed for teacher education, curriculum development, and comparative research on the integration of generative AI in mathematics education.
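The score comparison reported above can be restated as a short sketch. The dictionary below simply transcribes the four rubric totals given in the abstract (out of 24 points); the ranking and percentage formatting are illustrative additions, not part of the study's analysis:

```python
# Rubric totals (out of 24) reported in the abstract for each
# model-prompt combination; "C" = command prompt, "R" = request prompt.
scores = {
    "Gemini-R": 22,
    "Gemini-C": 19,
    "ChatGPT-C": 23,
    "ChatGPT-R": 15,
}

MAX_SCORE = 24  # maximum total on the Activity Evaluation and Feedback Tool

# Rank the combinations from highest to lowest total score.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

for combo, total in ranked:
    print(f"{combo}: {total}/{MAX_SCORE} ({total / MAX_SCORE:.0%})")
```

Run as-is, this orders the combinations ChatGPT-C, Gemini-R, Gemini-C, ChatGPT-R, mirroring the abstract's finding that ChatGPT performed best with command prompts while Gemini was strong with both prompt types.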

Keywords


Details

Primary Language

English

Subjects

Educational Technology and Computing

Journal Section

Research Article

Publication Date

April 24, 2026

Submission Date

October 2, 2025

Acceptance Date

December 18, 2025

Published in Issue

Year 2026 Volume: 19 Number: 2

APA
Aydoğdu, M. Z., Çaylan Ergene, B., & Ergene, Ö. (2026). Shaping Mathematics Activities with Generative AI: Prompt Types, Models and Pedagogical Outcomes. Journal of Theoretical Educational Sciences, 19(2), 405-432. https://doi.org/10.30831/akukeg.1795773