Research Article

Yazılım geliştirme taleplerinin metin madenciliği yöntemleriyle önceliklendirilmesi

Year 2019, Volume: 25 Issue: 5, 615 - 620, 21.10.2019

Prioritization of software development demands with text mining techniques

Abstract

In corporations, software issues and change requests are generally forwarded to the Information Technology (IT) unit through a demand management system. The priority recorded in this system is critically important to the IT unit. However, because the priority is left to the discretion of the person who creates the record, it is not always realistic. For instance, a non-critical, low-priority change request may be entered as high priority, which can lead to faulty planning and customer dissatisfaction. In this work, internal customer demands were classified using text mining techniques in order to predict their priority. The system was trained and tested with records extracted from the demand management system of a corporation. After the raw textual demand data were cleaned and preprocessed, the TF-IDF (Term Frequency-Inverse Document Frequency) weighting scheme was used to build the document-term matrix. Several classification algorithms were tested on the resulting data set, and the highest performance, an F-score of 54.1%, was obtained with the Sequential Minimal Optimization algorithm. In addition, on the data set whose classes were balanced by oversampling, the highest performance, an F-score of 74.5%, was achieved with the Random Forest algorithm.
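
The steps named in the abstract (TF-IDF weighting, class balancing by oversampling, and evaluation by F-score) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state which TF-IDF variant, oversampling method, or F-score averaging was used, so the sketch assumes raw term counts with idf = log(N / df), random duplication of minority-class rows, and a per-class F1. The function names and the toy demand texts are hypothetical.

```python
import math
import random
from collections import Counter


def tfidf_matrix(docs):
    """Build a document-term matrix with TF-IDF weights.

    Assumption: tf is the raw term count and idf = log(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[t] * math.log(n_docs / df[t]) for t in vocab])
    return vocab, matrix


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest one (assumed balancing method)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical demand texts and priority labels, for illustration only.
docs = ["login page error", "report page slow", "change report title"]
labels = ["high", "low", "low"]

vocab, X = tfidf_matrix(docs)
X_bal, y_bal = random_oversample(X, labels)
```

In this sketch the balanced rows `X_bal` and labels `y_bal` would be passed to a classifier of choice. In practice, oversampling is usually applied to the training split only, so that duplicated rows do not leak into the test set and inflate the measured F-score.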

Details

Primary Language: Turkish
Subjects: Engineering
Journal Section: Research Article
Authors: Murat Can Tekin, Volkan Tunalı
Publication Date: October 21, 2019
Published in Issue: Year 2019, Volume: 25, Issue: 5

Cite

APA Tekin, M. C., & Tunalı, V. (2019). Yazılım geliştirme taleplerinin metin madenciliği yöntemleriyle önceliklendirilmesi. Pamukkale Üniversitesi Mühendislik Bilimleri Dergisi, 25(5), 615-620.




