Research Article

Security Evaluation of AI-Generated Code: A Comparative Study of ChatGPT, Copilot, And Gemini through Static and Dynamic Analysis

Volume: 11 Number: 2 August 31, 2025

Abstract

This study examines the security of code produced by the generative artificial intelligence (AI) tools ChatGPT, Copilot, and Gemini within software development workflows. Web application login code generated by each tool was systematically evaluated for security vulnerabilities using static and dynamic code analysis. The results indicate that although these AI models make code generation more efficient, they also introduce varying levels of security risk. Copilot exhibited the highest cumulative risk, with multiple high-level vulnerabilities, while ChatGPT demonstrated a lower risk profile. Gemini produced relatively optimized code, but that code contained critical security flaws that require manual review. The most common vulnerabilities across all models were insecure design and security logging and monitoring failures, which points to a systemic issue in AI-generated code. The findings emphasize that generic security-focused prompts are insufficient: developers must use specific, security-oriented prompts, for example prompts that ask for secure-by-design principles and OWASP Top Ten protections. This study contributes to the growing body of literature on the security implications of integrating AI into software development, and it highlights the importance of human oversight and carefully crafted prompts in mitigating potential risks.
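To make the two most common vulnerability classes named above more concrete, the following is a minimal Python sketch of a login routine that records authentication events and limits repeated failures. It is an illustration only and is not taken from the code evaluated in the study. The names `LoginService`, `hash_password`, `PBKDF2_ITERATIONS`, and `MAX_FAILED_ATTEMPTS`, as well as the in-memory user store, are assumptions introduced for this example.

```python
import hashlib
import hmac
import logging
import secrets

# Hypothetical example: all names and thresholds below are assumptions,
# not values or code from the study.
logger = logging.getLogger("auth")

PBKDF2_ITERATIONS = 200_000  # assumed work factor; raise it for real deployments
MAX_FAILED_ATTEMPTS = 5      # assumed lockout threshold


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash so stored credentials are not plain text."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )


class LoginService:
    """In-memory login check with failure counting and security logging."""

    def __init__(self) -> None:
        self._users: dict[str, tuple[bytes, bytes]] = {}  # name -> (salt, hash)
        self._failures: dict[str, int] = {}               # name -> failed tries

    def register(self, username: str, password: str) -> None:
        salt = secrets.token_bytes(16)
        self._users[username] = (salt, hash_password(password, salt))

    def login(self, username: str, password: str) -> bool:
        # Refuse early once the failure threshold is reached, and log it.
        if self._failures.get(username, 0) >= MAX_FAILED_ATTEMPTS:
            logger.warning("login blocked, too many failures: user=%r", username)
            return False

        # Unknown users still go through hashing with a dummy salt so the
        # response time does not reveal whether the account exists.
        salt, stored = self._users.get(username, (b"\x00" * 16, b""))
        candidate = hash_password(password, salt)
        # compare_digest performs a constant-time comparison.
        ok = username in self._users and hmac.compare_digest(candidate, stored)

        if ok:
            self._failures.pop(username, None)
            logger.info("login succeeded: user=%r", username)
            return True

        self._failures[username] = self._failures.get(username, 0) + 1
        # %r escapes control characters, which limits log injection.
        logger.warning(
            "login failed: user=%r attempt=%d", username, self._failures[username]
        )
        return False
```

In this sketch, a successful login resets the failure counter, and the sixth attempt after five consecutive failures is rejected even with the correct password. A production system would also need persistent storage, a lockout expiry, and a bound on the failure table, none of which are shown here.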

Details

Primary Language

English

Subjects

Computer Software

Journal Section

Research Article

Publication Date

August 31, 2025

Submission Date

June 16, 2025

Acceptance Date

August 9, 2025

Published in Issue

Year 2025 Volume: 11 Number: 2

APA
Ceran, O. (2025). Security Evaluation of AI-Generated Code: A Comparative Study of ChatGPT, Copilot, And Gemini through Static and Dynamic Analysis. Gazi Journal of Engineering Sciences, 11(2), 304-320. https://izlik.org/JA39ZF75PA
AMA
1.Ceran O. Security Evaluation of AI-Generated Code: A Comparative Study of ChatGPT, Copilot, And Gemini through Static and Dynamic Analysis. GJES. 2025;11(2):304-320. https://izlik.org/JA39ZF75PA
Chicago
Ceran, Onur. 2025. “Security Evaluation of AI-Generated Code: A Comparative Study of ChatGPT, Copilot, And Gemini through Static and Dynamic Analysis”. Gazi Journal of Engineering Sciences 11 (2): 304-20. https://izlik.org/JA39ZF75PA.
EndNote
Ceran O (August 1, 2025) Security Evaluation of AI-Generated Code: A Comparative Study of ChatGPT, Copilot, And Gemini through Static and Dynamic Analysis. Gazi Journal of Engineering Sciences 11 2 304–320.
IEEE
[1]O. Ceran, “Security Evaluation of AI-Generated Code: A Comparative Study of ChatGPT, Copilot, And Gemini through Static and Dynamic Analysis”, GJES, vol. 11, no. 2, pp. 304–320, Aug. 2025, [Online]. Available: https://izlik.org/JA39ZF75PA
ISNAD
Ceran, Onur. “Security Evaluation of AI-Generated Code: A Comparative Study of ChatGPT, Copilot, And Gemini through Static and Dynamic Analysis”. Gazi Journal of Engineering Sciences 11/2 (August 1, 2025): 304-320. https://izlik.org/JA39ZF75PA.
JAMA
1.Ceran O. Security Evaluation of AI-Generated Code: A Comparative Study of ChatGPT, Copilot, And Gemini through Static and Dynamic Analysis. GJES. 2025;11:304–320.
MLA
Ceran, Onur. “Security Evaluation of AI-Generated Code: A Comparative Study of ChatGPT, Copilot, And Gemini through Static and Dynamic Analysis”. Gazi Journal of Engineering Sciences, vol. 11, no. 2, Aug. 2025, pp. 304-20, https://izlik.org/JA39ZF75PA.
Vancouver
1.Onur Ceran. Security Evaluation of AI-Generated Code: A Comparative Study of ChatGPT, Copilot, And Gemini through Static and Dynamic Analysis. GJES [Internet]. 2025 Aug. 1;11(2):304-20. Available from: https://izlik.org/JA39ZF75PA

Gazi Journal of Engineering Sciences (GJES) publishes open access articles under a Creative Commons Attribution 4.0 International License (CC BY 4.0).