An Automated Bug Triaging Approach using Deep Learning: A Replication Study

Eray Tüzün; Alperen Çetin; Emre Doğan

doi:10.31590/ejosat.781341

Research Article

Year 2021, Issue: 21, 268-274, 31.01.2021

Abstract

Bug management is the process of identifying and fixing bugs. After a bug is identified, it needs to be triaged. Bug triaging is the process of prioritizing bugs and assigning an appropriate developer to each one. The main task in bug triaging is to predict, from a given bug report, the developer best suited to fix the bug. This can be framed as a classification problem in which the textual bug attributes (title, description, etc.) are the inputs and the developer (class label) is the output. Since manual bug triaging is time-consuming, several algorithms have been proposed to automate it. One of the most recent successful approaches is DeepTriage, which employs a Deep Bidirectional Recurrent Neural Network with Attention (DBRNN-A) for this classification task.
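The classification framing above can be illustrated with a minimal sketch. It is not the DBRNN-A model; it is a deliberately simple token-overlap baseline, and the bug reports, developer names, and function names (`tokenize`, `train`, `predict`) are hypothetical, chosen only to show bug report text going in and a developer label coming out.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def train(reports):
    """Build one token-frequency profile per developer (the class label).

    `reports` is a list of (title, description, developer) tuples.
    """
    profiles = defaultdict(Counter)
    for title, description, developer in reports:
        profiles[developer].update(tokenize(title + " " + description))
    return profiles


def predict(profiles, title, description):
    """Return the developer whose profile overlaps most with the report text."""
    tokens = tokenize(title + " " + description)
    scores = {
        dev: sum(profile[t] for t in tokens)
        for dev, profile in profiles.items()
    }
    return max(scores, key=scores.get)


# Hypothetical triage history, invented for illustration only.
history = [
    ("Crash on startup", "Browser crashes when the profile is loaded", "alice"),
    ("Crash in renderer", "Renderer process crashes on large images", "alice"),
    ("Wrong font in toolbar", "Toolbar font is too small on Linux", "bob"),
    ("Toolbar icons misaligned", "Icons overlap in the toolbar", "bob"),
]

profiles = train(history)
print(predict(profiles, "Toolbar button missing", "The toolbar does not show the icon"))
```

A neural model such as DBRNN-A replaces the token-count profiles with learned sequence representations, but the interface stays the same: text in, one developer label out.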

In this study, we implement an improved version of the automated bug triaging method DeepTriage. To improve the model's performance, we make three contributions to the original implementation: (1) using GRU instead of LSTM, which speeds up training by allowing a larger batch size with the same memory usage; (2) using a corpus that combines data from different datasets to create a more general model; and (3) adding extra dense layers before the multiclass classification to improve the results. In our experiments, we achieved state-of-the-art results on the Mozilla Firefox dataset, with an accuracy of 46.6%. On the Chromium dataset, we obtained a higher accuracy (44.0%) than that reported in the original paper (42.7%). The resulting model and its source code are publicly available for future research in this area.
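One way to see why swapping LSTM for GRU can free memory is to count the parameters of a single recurrent layer. The sketch below uses the textbook formulations (four weight blocks for LSTM, three for GRU, one bias vector per block). The input size of 200 and hidden size of 1024 are illustrative assumptions, not values taken from this study, and specific framework implementations may add extra bias terms.

```python
def recurrent_params(input_dim, hidden_dim, n_blocks):
    """Parameter count of one recurrent layer with `n_blocks` gate/candidate blocks.

    Each block has an input weight matrix (hidden_dim x input_dim),
    a recurrent weight matrix (hidden_dim x hidden_dim), and a bias vector.
    """
    per_block = hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim
    return n_blocks * per_block


def lstm_params(input_dim, hidden_dim):
    # LSTM: input, forget, and output gates plus the cell candidate = 4 blocks.
    return recurrent_params(input_dim, hidden_dim, 4)


def gru_params(input_dim, hidden_dim):
    # GRU: update and reset gates plus the hidden candidate = 3 blocks.
    return recurrent_params(input_dim, hidden_dim, 3)


# Illustrative sizes (assumptions, not taken from the paper).
input_dim, hidden_dim = 200, 1024
lstm = lstm_params(input_dim, hidden_dim)
gru = gru_params(input_dim, hidden_dim)

print(f"LSTM: {lstm:,} parameters")
print(f"GRU:  {gru:,} parameters")
print(f"GRU / LSTM ratio: {gru / lstm:.2f}")
```

Under these assumptions a GRU layer has three quarters of the parameters of an LSTM layer of the same size, and it keeps no separate cell state, which is consistent with fitting a larger batch into the same memory.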

Keywords

recurrent neural networks, long short-term memory, gated recurrent unit, bug triaging

References

Anvik, J., Hiew, L., & Murphy, G. C. (2006). Who should fix this bug? In Proceedings - International Conference on Software Engineering (Vol. 2006, pp. 361–370). New York, New York, USA: IEEE Computer Society. https://doi.org/10.1145/1134285.1134336
Cubranic, D., & Murphy, G. C. (2004). Automatic bug triage using text categorization. 16th Int. Conference on Software Engineering and Knowledge Engineering, 92–97. Retrieved from http://www.eclipse.org.
Hindle, A., Barr, E. T., Su, Z., Gabel, M., & Devanbu, P. (2012). On the naturalness of software. In Proceedings - International Conference on Software Engineering (pp. 837–847). https://doi.org/10.1109/ICSE.2012.6227135
Jeong, G., Kim, S., & Zimmermann, T. (2009). Improving bug triage with bug tossing graphs. In ESEC-FSE’09 - Proceedings of the Joint 12th European Software Engineering Conference and 17th ACM SIGSOFT Symposium on the Foundations of Software Engineering (pp. 111–120). https://doi.org/10.1145/1595696.1595715
Mani, S., Sankaran, A., & Aralikatte, R. (2019). Deeptriage: Exploring the effectiveness of deep learning for bug triaging. In Proceedings of the ACM India Joint International Conference on Data Science and Management of Data (pp. 171–179). https://doi.org/10.1145/3297001.3297023
Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. In 1st International Conference on Learning Representations, ICLR 2013 - Workshop Track Proceedings. International Conference on Learning Representations, ICLR. Retrieved from http://ronan.collobert.com/senna/
QA: Quality assurance at Mozilla - Mozilla | MDN. (n.d.). Retrieved September 6, 2020, from https://developer.mozilla.org/en-US/docs/Mozilla/QA
Xuan, J., Jiang, H., Ren, Z., Yan, J., & Luo, Z. (2010). Automatic bug triage using semi-supervised text classification. In SEKE 2010 - Proceedings of the 22nd International Conference on Software Engineering and Knowledge Engineering (pp. 209–214). Retrieved from http://arxiv.org/abs/1704.04769

Derin Öğrenme ile Otomatik Hata Triyajlama: Bir Replikasyon Çalışması (Automated Bug Triaging with Deep Learning: A Replication Study)

Abstract (translated from the Turkish version)

Bug management is the process of identifying and resolving bugs. In the bug management process, once a bug has been identified, it must be triaged. Bug triaging consists of prioritizing the bug and assigning it to an appropriate developer. The core of this process is predicting the most appropriate developer to resolve a given bug report. It can be defined as a classification problem in which the textual parts of bug reports (bug title, bug description) are the input and the developers to be recommended are the output. Because bug triaging that is not automated is a time-consuming process, many algorithms exist for automating it. One of the most recent successful models for this problem is DeepTriage. This model uses a deep bidirectional recurrent neural network with attention (DBRNN-A) for classification.

In this study, we implemented an improved version of DeepTriage, a successful bug triaging method from the literature. To increase the performance of the previously proposed model, we made three contributions to the original work: (1) using GRU instead of LSTM to reduce training time by using larger batches with the same amount of memory, (2) using a vocabulary built by combining different datasets to create a more general model, and (3) adding extra neural network layers before the multiclass classification to improve the results. In our experiments, we obtained the same results as the original study on the Mozilla Firefox dataset, with an accuracy of 46.6%. On the Chromium dataset, we obtained a higher accuracy (44.0%) than the original study (42.7%). The improved model and its source code have been shared for future work on this topic.

Keywords

recurrent neural network, long short-term memory, gated recurrent unit, bug triaging

Details

Primary Language	English
Subjects	Engineering
Section	Articles
Authors	Eray Tüzün 0000-0002-5550-7816 Alperen Çetin 0000-0001-9879-8599 Emre Doğan 0000-0002-2558-7624
Publication Date	January 31, 2021
Published in Issue	Year 2021, Issue: 21

Cite

APA	Tüzün, E., Çetin, A., & Doğan, E. (2021). An Automated Bug Triaging Approach using Deep Learning: A Replication Study. Avrupa Bilim Ve Teknoloji Dergisi(21), 268-274. https://doi.org/10.31590/ejosat.781341
