Research Article

Classifying Release Note Descriptions Using BERT: A First Step Toward Automatic Versioning in Distributed Software Development

Year 2023, Volume 14, Issue 4, 629 - 638, 31.12.2023
https://doi.org/10.24012/dumf.1345893

Abstract

Distributed Software Development is the practice of developing software with a team located in different places. The software versioning process is critically important in distributed development because it helps track the various software versions being developed and sustain projects. At each transition to a new version, the development team provides release notes that inform all team members and stakeholders about the changes and allow project processes to be followed. Release notes contain information about the features, bug fixes, and other changes in a new software version. Creating release notes for new software versions and determining the timing of the version transition can be costly. Although the literature contains some papers on generating release notes, there is no study on automatic versioning. In this context, the aim of this study is to predict the development types in release notes as the first stage of an automatic versioning tool planned for future work. BERT, a popular transformer, was used to classify the development type in release notes, and the model achieved 86% accuracy on our own open dataset. In addition, this study presents insights into the model's decision-making process in the context of explainable artificial intelligence using the ELI5 library.

References

  • [1] A. B. Marques, R. Rodrigues, and T. Conte, ‘Systematic literature reviews in distributed software development: A tertiary study’, in IEEE 7th International Conference on Global Software Engineering, ICGSE 2012, 2012, pp. 134–143. doi: 10.1109/ICGSE.2012.29.
  • [2] L. Linsbauer, F. Schwägerl, T. Berger, and P. Grünbacher, ‘Concepts of variation control systems’, Journal of Systems and Software, vol. 171, p. 110796, Jan. 2021, doi: 10.1016/J.JSS.2020.110796.
  • [3] A. M. Aytekin, ‘Release Management with Continuous Delivery: A Case Study’, Release Management with Continuous Delivery: A Case Study, vol. 8, no. 9, 2014, Accessed: Jan. 08, 2023. [Online]. Available: https://publications.waset.org/9999440/release-management-with-continuous-delivery-a-case-study
  • [4] L. Layman, L. Williams, D. Damian, and H. Bures, ‘Essential communication practices for Extreme Programming in a global software development team’, Inf Softw Technol, vol. 48, no. 9, pp. 781–794, Sep. 2006, doi: 10.1016/J.INFSOF.2006.01.004.
  • [5] T. Preston-Werner, ‘Semantic Versioning 2.0.0 | Semantic Versioning’. Accessed: Jan. 10, 2023. [Online]. Available: https://semver.org/
  • [6] G. Karsai and D. Balasubramanian, ‘Assurance Provenance: The Next Challenge in Software Documentation’, in Leveraging Applications of Formal Methods, Verification and Validation. Software Engineering, vol. 13702, T. Margaria and B. Steffen, Eds., Springer, Cham, 2022, pp. 90–104. doi: 10.1007/978-3-031-19756-7_6.
  • [7] A. C. B. G. da Silva, G. de F. Carneiro, F. Brito e Abreu, and M. P. Monteiro, ‘Frequent Releases in Open Source Software: A Systematic Review’, Information, vol. 8, no. 3, p. 109, Sep. 2017, doi: 10.3390/INFO8030109.
  • [8] S. S. Nath and B. Roy, ‘Automatically Generating Release Notes with Content Classification Models’, International Journal of Software Engineering and Knowledge Engineering, vol. 31, no. 11–12, pp. 1721–1740, Jan. 2022, doi: 10.1142/S0218194021400192.
  • [9] L. Moreno, G. Bavota, M. Di Penta, R. Oliveto, A. Marcus, and G. Canfora, ‘ARENA: An Approach for the Automated Generation of Release Notes’, IEEE Transactions on Software Engineering, vol. 43, no. 2, pp. 106–127, Feb. 2017, doi: 10.1109/TSE.2016.2591536.
  • [10] A. Şeker, B. Diri, and H. Arslan, ‘Using Open Source Distributed Code Development Features on GitHub: A Real-World Example’, in 2nd International Eurasian Conference on Science, Engineering and Technology, 2020, pp. 518–525.
  • [11] K. Herzig, S. Just, and A. Zeller, ‘It’s not a bug, it’s a feature: How misclassification impacts bug prediction’, in Proceedings - International Conference on Software Engineering, 2013, pp. 392–401. doi: 10.1109/ICSE.2013.6606585.
  • [12] M. Ohira et al., ‘A dataset of high impact bugs: Manually-classified issue reports’, in IEEE International Working Conference on Mining Software Repositories, IEEE Computer Society, Aug. 2015, pp. 518–521. doi: 10.1109/MSR.2015.78.
  • [13] A. Şeker, S. Yeşilyurt, İ. Can Ardahan, and B. Çınar, ‘Prediction of Development Types from Release Notes for Automatic Versioning of OSS Projects’, in Smart Applications with Advanced Machine Learning and Human-Centred Problem Design, Springer International Publishing, 2023, pp. 399–407. doi: 10.1007/978-3-031-09753-9_28.
  • [14] M. Ali, A. Aftab, and W. H. Butt, ‘Automatic Release Notes Generation’, Proceedings of the IEEE International Conference on Software Engineering and Service Sciences, ICSESS, vol. 2020-October, pp. 76–81, Oct. 2020, doi: 10.1109/ICSESS49938.2020.9237671.
  • [15] S. Minaee, N. Kalchbrenner, E. Cambria, N. Nikzad, M. Chenaghlu, and J. Gao, ‘Deep Learning-based Text Classification: A Comprehensive Review’, ACM Computing Surveys (CSUR), vol. 54, no. 3, Apr. 2021, doi: 10.1145/3439726.
  • [16] A. Gasparetto, M. Marcuzzo, A. Zangari, and A. Albarelli, ‘A Survey on Text Classification Algorithms: From Text to Predictions’, Information, vol. 13, no. 2, p. 83, Feb. 2022, doi: 10.3390/INFO13020083.
  • [17] G. Soyalp, A. Alar, K. Ozkanli, and B. Yildiz, ‘Improving Text Classification with Transformer’, Proceedings - 6th International Conference on Computer Science and Engineering, UBMK 2021, pp. 707–712, 2021, doi: 10.1109/UBMK52708.2021.9558906.
  • [18] X. Chen, P. Cong, and S. Lv, ‘A Long-Text Classification Method of Chinese News Based on BERT and CNN’, IEEE Access, vol. 10, pp. 34046–34057, 2022, doi: 10.1109/ACCESS.2022.3162614.
  • [19] ‘Mozilla Firefox Release Notes’. Accessed: Jan. 16, 2023. [Online]. Available: https://www.mozilla.org/en-US/firefox/releases/
  • [20] ‘Thunderbird Release Notes — Thunderbird’. Accessed: Jan. 16, 2023. [Online]. Available: https://www.thunderbird.net/en-US/thunderbird/releases/
  • [21] ‘Slack for Windows - Release Notes | Slack’. Accessed: Jan. 16, 2023. [Online]. Available: https://slack.com/release-notes/windows
  • [22] ‘Releases · obsproject/obs-studio’. Accessed: Jan. 16, 2023. [Online]. Available: https://github.com/obsproject/obs-studio/releases
  • [23] A. Vaswani et al., ‘Attention is All you Need’, in Advances in Neural Information Processing Systems, Curran Associates, Inc., 2017.
  • [24] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, ‘BERT: Pre-training of deep bidirectional transformers for language understanding’, in Proceedings of NAACL-HLT 2019, Minnesota, 2019, pp. 4171–4186. Accessed: Jan. 17, 2023. [Online]. Available: https://arxiv.org/abs/1810.04805
  • [25] M. Grandini, E. Bagli, and G. Visani, ‘Metrics for Multi-Class Classification: an Overview’, arXiv preprint arXiv:2008.05756, Aug. 2020.
  • [26] ‘ELI5’. Accessed: Jan. 24, 2023. [Online]. Available: https://eli5.readthedocs.io/en/latest/overview.html

Classifying Release Notes Explanations using BERT: An Initial Step to Automatic Versioning in Distributed Software Development


Abstract

Distributed Software Development is the practice of developing software with a team whose members work in different locations. Software versioning is crucial in distributed development because it helps track the various software versions under development and maintain projects. At each transition to a new version, the development team presents release notes that inform all team members and stakeholders of the changes and make project progress traceable. Release notes contain information about the features, bug fixes, and other changes included in a new software release. Generating release notes and deciding when to transition to a new version can be costly. Although the literature includes some papers on generating release notes, there is no study on automatic versioning. In this context, the aim of this paper is to predict the development types in release notes as the first phase of an automated versioning tool planned for future work. We used BERT, a popular transformer model, to classify the development types in release notes, and our model achieved 86% accuracy on our own public dataset. Additionally, we present insights into the model's decision-making process in the context of explainable AI using the ELI5 library.
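
To make the link between predicted development types and automatic versioning concrete, the following is a minimal sketch and not the tool described in the paper. It assumes a hypothetical label set (`feature`, `bugfix`, `other`) and a hypothetical mapping from those labels to Semantic Versioning bumps; the abstract does not specify either. The `accuracy` helper shows how a figure such as the reported 86% would be computed from true and predicted labels.

```python
# Hypothetical mapping from predicted development type to a SemVer bump.
# The label names and the mapping are assumptions made for illustration only.
BUMP_FOR_TYPE = {"feature": "minor", "bugfix": "patch", "other": "patch"}
PRIORITY = {"patch": 0, "minor": 1}


def accuracy(true_labels, predicted_labels):
    """Fraction of release-note entries whose predicted type matches the true type."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


def next_version(current, predicted_types):
    """Return the next MAJOR.MINOR.PATCH string implied by the predicted types."""
    major, minor, patch = (int(part) for part in current.split("."))
    bumps = [BUMP_FOR_TYPE[t] for t in predicted_types]
    # The strongest bump among all entries decides the new version.
    strongest = max(bumps, key=PRIORITY.__getitem__, default=None)
    if strongest == "minor":
        return f"{major}.{minor + 1}.0"
    if strongest == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current  # no entries, so no version change


if __name__ == "__main__":
    print(next_version("1.4.2", ["bugfix", "feature"]))
    print(accuracy(["feature", "bugfix", "other"], ["feature", "bugfix", "feature"]))
```

In this sketch the classifier output is passed in as a plain list of labels, so any model (BERT or otherwise) could supply it; a real tool would also need a rule for major bumps, which is left out because the abstract gives no basis for one.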

There are 26 citations in total.

Details

Primary Language English
Subjects Software Engineering (Other)
Journal Section Articles
Authors

Abdulkadir Şeker 0000-0002-4552-2676

Early Pub Date December 31, 2023
Publication Date December 31, 2023
Submission Date August 18, 2023
Published in Issue Year 2023, Volume 14, Issue 4

Cite

IEEE A. Şeker, “Classifying Release Notes Explanations using BERT: An Initial Step to Automatic Versioning in Distributed Software Development”, DÜMF MD, vol. 14, no. 4, pp. 629–638, 2023, doi: 10.24012/dumf.1345893.
All articles published by DUJE are licensed under the Creative Commons Attribution 4.0 International License. This permits anyone to copy, redistribute, remix, transmit, and adapt the work, provided the original work and source are appropriately cited.