A GENERATIVE AI-BASED, CONTRACT-DRIVEN AUTOMATED UNIT TEST MAINTENANCE APPROACH FOR BACKEND SOFTWARE PROJECTS
Year 2025, Volume: 4 Issue: 2, pp. 74-97, 31.12.2025
Furkan Samet Akıncı, Tuğkan Tuğlular
Abstract
In modern backend systems, function contracts and API surfaces change frequently, which causes unit
tests to become invalid quickly. Most existing tools focus on initial test generation, while a
significant part of test maintenance is still carried out manually. This study presents a
contract-based maintenance approach that detects contract changes at the source level and regenerates
the associated Jest tests with large language models (LLMs). The proposed method was evaluated on a
total of 28 contract changes across four open-source TypeScript backend projects. The experiments
show that the tool provides a practical "self-healing" mechanism that can automatically repair
failing test runs, deliver partial improvements, and integrate with the existing test infrastructure.
The findings also show that the method's success is closely tied to project architecture and
test-suite quality.
Ethical Statement
No benefit of any kind was derived from this study.
References
-
1. Ricca, F., Marchetto, A., Stocco, A., "AI-based Test Automation: A Grey Literature Analysis", 14th IEEE Conference on Software Testing, Verification and Validation Workshops (ICSTW), Online, pp. 263-270, April 12-16, 2021.
-
2. DeMillo, R.A., Offutt, A.J., "Constraint-Based Automatic Test Data Generation", IEEE Transactions on Software Engineering, 17(9), pp. 900-910, 1991.
-
3. Fraser, G., Arcuri, A., "EvoSuite: automatic test suite generation for object-oriented software", ESEC/FSE 2011, Szeged, Hungary, pp. 416-419, September 5-9, 2011.
-
4. Broide, L., Stern, R., "EvoGPT: Enhancing Test Suite Robustness via LLM-Based Generation and Genetic Optimization", arXiv.org, https://arxiv.org/abs/2505.12424, Published on May 20, 2025, Accessed on October 2, 2025.
-
5. Pacheco, C., Lahiri, S.K., Ernst, M.D., Ball, T., "Feedback-Directed Random Test Generation", 29th International Conference on Software Engineering (ICSE'07), Minneapolis, MN, USA, pp. 75-84, May 23-25, 2007.
-
6. Lukasczyk, S., Fraser, G., "Pynguin: Automated unit test generation for python", Proceedings of the ACM/IEEE 44th International Conference on Software Engineering: Companion Proceedings, Pittsburgh, PA, USA, pp. 168-172, May 22-27, 2022.
-
7. Schäfer, M., Nadi, S., Eghbali, A., Tip, F., "An Empirical Evaluation of Using Large Language Models for Automated Unit Test Generation", IEEE Transactions on Software Engineering, 50(1), pp. 85-105, 2024.
-
8. Yang, L., Yang, C., Gao, S., Wang, W., Wang, B., Zhu, Q., et al., "On the evaluation of large language models in unit test generation", Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, Sacramento, CA, USA, pp. 1607-1619, October 27 - November 1, 2024.
-
9. Shang, Y., Zhang, Q., Fang, C., Gu, S., Zhou, J., Chen, Z., "A large-scale empirical study on fine-tuning large language models for unit testing", Proceedings of the ACM on Software Engineering, 2(ISSTA), pp. 1678-1700, 2025.
-
10. Wang, Z., Liu, K., Li, G., Jin, Z., "Hits: High-coverage llm-based unit test generation via method slicing", Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, Sacramento, CA, USA, pp. 1258-1268, October 27- November 1, 2024.
-
11. Wang, W., Yang, C., Wang, Z., Huang, Y., Chu, Z., Song, D., Ma, L., "Testeval: Benchmarking large language models for test case generation", arXiv preprint arXiv:2406.04531, pp. 1-15, 2024.
-
12. Rahman, S., Kuhar, S., Cirisci, B., Garg, P., Wang, S., Ma, X., Ray, B., "UTFix: Change aware unit test repairing using LLM", Proceedings of the ACM on Programming Languages, 9(OOPSLA1), pp. 143-168, 2025.
-
13. Daniel, B., Jagannath, V., Dig, D., Marinov, D., "ReAssert: Suggesting repairs for broken unit tests", 2009 IEEE/ACM International Conference on Automated Software Engineering, Auckland, New Zealand, pp. 433-444, November 16-20, 2009.
-
14. Imtiaz, J., Iqbal, M.Z., "An automated model-based approach to repair test suites of evolving web applications", Journal of Systems and Software, 171, 110841, 2021.
-
15. Imtiaz, J., Sherin, S., Khan, M.U., Iqbal, M.Z., "A systematic literature review of test breakage prevention and repair techniques", Information and Software Technology, 113, pp. 1-19, 2019.
-
16. Atlidakis, V., Godefroid, P., Polishchuk, M., "Restler: Stateful rest api fuzzing", 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), Montreal, Canada, pp. 748-758, May 25-31, 2019.
-
17. Arcuri, A., Galeotti, J.P., Marculescu, B., Zhang, M., "EvoMaster: A search-based system test generation tool", Journal of Open Source Software, 6(57), 2153, 2021.
-
18. Yasmin, J., Tian, Y., Yang, J., "A first look at the deprecation of RESTful APIs: An empirical study", 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME), Adelaide, Australia, pp. 151-161, September 28 - October 2, 2020.
-
19. Koci, R., Franch, X., Jovanovic, P., Abelló, A., "Web API evolution patterns: A usage-driven approach", Journal of Systems and Software, 198, 111609, 2023.
-
20. Lamothe, M., Guéhéneuc, Y.G., Shang, W., "A systematic review of API evolution literature", ACM Computing Surveys (CSUR), 54(8), pp. 1-36, 2021.
-
21. Sherret, D., et al., "ts-morph — TypeScript Compiler API wrapper", GitHub repository, https://github.com/dsherret/ts-morph, Accessed on October 2, 2025.
-
22. Free Software Foundation, "Comparing and Merging Files", GNU Diffutils Manual, https://www.gnu.org/software/diffutils/manual/diffutils.html, Accessed on October 2, 2025.
-
23. The Git Project, "git-apply — Apply a patch to files and/or to the index", Git Documentation, https://www.kernel.org/pub/software/scm/git/docs/git-apply.html, Published on July 14, 2025, Accessed on October 2, 2025.
-
24. The Git Project, "git-diff — Show changes between commits, commit and working tree, etc.", Git Documentation, https://git-scm.com/docs/git-diff, Accessed on October 2, 2025.
-
25. Sorhus, S., et al., "execa — Process execution for humans", GitHub repository, https://github.com/sindresorhus/execa, Accessed on October 2, 2025.
-
26. Holowaychuk, T.J., and Commander.js contributors, "Commander.js — Node.js command-line interfaces made easy", GitHub repository, https://github.com/tj/commander.js, Accessed on October 2, 2025.
-
27. motdotla, et al., "dotenv — Loads environment variables from .env", GitHub repository, https://github.com/motdotla/dotenv, Accessed on October 2, 2025.
-
28. isaacs, et al., "glob — Match files using the patterns the shell uses", GitHub repository, https://github.com/isaacs/node-glob, Accessed on October 2, 2025.
-
29. Akıncı, F., "@furkanakinci/contract-unit-test-maintainer", npm, https://www.npmjs.com/package/@furkanakinci/contract-unit-test-maintainer, Accessed on October 2, 2025.
-
30. Brito, A., et al., "APIDiff: Detecting API breaking changes", 2018 IEEE 25th International Conference on Software Analysis, Evolution and Reengineering (SANER), Campobasso, Italy, pp. 1-10, March 20-23, 2018.
-
31. Jezek, K., Dietrich, J., Brada, P., "How java apis break–an empirical study", Information and Software Technology, 65, pp. 129-146, 2015.
-
32. Dig, D., Johnson, R., "How do APIs evolve? A story of refactoring", Journal of software maintenance and evolution: Research and Practice, 18(2), pp. 83-107, 2006.
A CONTRACT-DRIVEN AUTOMATED UNIT TEST MAINTENANCE APPROACH WITH GENERATIVE ARTIFICIAL INTELLIGENCE FOR BACKEND SOFTWARE PROJECTS
Abstract
Modern backend systems frequently undergo changes in function contracts and API surfaces, which
can quickly render unit tests outdated, brittle, or unusable. Most existing tools—whether based on
classical test generation or large language models (LLMs)—focus on initial test creation, leaving the
ongoing maintenance of existing test suites largely manual and error-prone. This paper presents a
contract-driven, AI-assisted framework for unit test maintenance in TypeScript backend projects. The
framework detects function-level contract changes and adapts related Jest tests through small,
validated edits synthesized by an LLM, without creating new tests or performing broad refactorings.
We evaluate the approach on 28 contract-change instances across four open-source projects. The
results indicate that contract-aware, LLM-based test maintenance can act as a practical self-healing
mechanism when contract changes are visible in the test surface, while its effectiveness remains
strongly shaped by project architecture and test-suite design.
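
The contract-change detection step described above can be illustrated with a minimal TypeScript sketch. It is not the authors' implementation: it assumes function contracts have already been extracted into plain data (in practice a parser would supply them), and the names `FunctionContract`, `ContractChange`, `renderSignature`, `diffContracts`, and the `createUser` example are all hypothetical.

```typescript
// Hypothetical data model: a function contract reduced to its name,
// parameter list, and return type. Not taken from the paper.
interface FunctionContract {
  name: string;
  params: { name: string; type: string; optional: boolean }[];
  returnType: string;
}

type ContractChange =
  | { kind: "removed"; name: string }
  | { kind: "added"; name: string }
  | { kind: "signature-changed"; name: string; before: string; after: string };

// Render a contract as a canonical signature string so two versions can be compared.
function renderSignature(c: FunctionContract): string {
  const params = c.params
    .map((p) => `${p.name}${p.optional ? "?" : ""}: ${p.type}`)
    .join(", ");
  return `${c.name}(${params}): ${c.returnType}`;
}

// Compare two contract snapshots (before and after a code change) by function name.
function diffContracts(
  before: FunctionContract[],
  after: FunctionContract[]
): ContractChange[] {
  const changes: ContractChange[] = [];
  before.forEach((oldC) => {
    const newC = after.filter((c) => c.name === oldC.name)[0];
    if (newC === undefined) {
      changes.push({ kind: "removed", name: oldC.name });
      return;
    }
    const b = renderSignature(oldC);
    const a = renderSignature(newC);
    if (a !== b) {
      changes.push({ kind: "signature-changed", name: oldC.name, before: b, after: a });
    }
  });
  after.forEach((newC) => {
    if (!before.some((c) => c.name === newC.name)) {
      changes.push({ kind: "added", name: newC.name });
    }
  });
  return changes;
}

// Illustrative input: a required `role` parameter is added to `createUser`.
const exampleBefore: FunctionContract[] = [
  {
    name: "createUser",
    params: [{ name: "email", type: "string", optional: false }],
    returnType: "Promise<User>",
  },
];
const exampleAfter: FunctionContract[] = [
  {
    name: "createUser",
    params: [
      { name: "email", type: "string", optional: false },
      { name: "role", type: "Role", optional: false },
    ],
    returnType: "Promise<User>",
  },
];
const exampleChanges = diffContracts(exampleBefore, exampleAfter);
```

In a pipeline of this kind, only the tests that exercise a function reported as `signature-changed` or `removed` would be handed to the LLM for a small edit, which keeps the maintenance step narrow rather than regenerating the whole suite.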
Ethical Statement
The authors declare that they have no conflict of interest regarding this study.