Research Article

Using Text Mining For Research Trends in Empirical Software Engineering

Year 2021, Volume 24, Issue 3, 1227 - 1235, 01.09.2021
https://doi.org/10.2339/politeknik.831391

Abstract

This paper examines research trends in the Empirical Software Engineering domain over the last two decades using text mining. It analyzes the abstracts of 10658 articles published in the Empirical Software Engineering literature. Using a probabilistic topic modelling technique, Latent Dirichlet Allocation, it identifies the main research topics within this domain. Further analysis evaluates how the focus of published work has shifted over the last two decades and depicts recent trends in research content. Through a comparison over time, it portrays the change of interest within empirical software engineering research and proposes a future research agenda to advance the field, beneficial for both academics and practitioners.


Deneysel Yazılım Mühendisliğindeki Araştırma Eğilimleri için Metin Madenciliği (Text Mining for Research Trends in Empirical Software Engineering)


Abstract

This study aims to examine research trends in the field of Empirical Software Engineering over the last twenty years using text mining techniques. Based on their abstracts, 10658 articles published in the literature on Empirical Software Engineering were examined. Using Latent Dirichlet Allocation, a statistical modelling technique, the main research topics in this field were identified and examined comparatively. The paper evaluates the changes of focus in studies published over the last twenty years and reveals recent trends in research content. Through comparative evaluation, it highlights the shift in research trends in empirical software engineering and proposes a research agenda that could benefit both academics and practitioners and advance the field.


Details

Primary Language English
Subjects Engineering
Journal Section Research Article
Authors

Gul Tokdemir 0000-0003-2441-3056

Publication Date September 1, 2021
Submission Date November 25, 2020
Published in Issue Year 2021, Volume 24, Issue 3

Cite

APA Tokdemir, G. (2021). Using Text Mining For Research Trends in Empirical Software Engineering. Politeknik Dergisi, 24(3), 1227-1235. https://doi.org/10.2339/politeknik.831391
AMA Tokdemir G. Using Text Mining For Research Trends in Empirical Software Engineering. Politeknik Dergisi. September 2021;24(3):1227-1235. doi:10.2339/politeknik.831391
Chicago Tokdemir, Gul. “Using Text Mining For Research Trends in Empirical Software Engineering”. Politeknik Dergisi 24, no. 3 (September 2021): 1227-35. https://doi.org/10.2339/politeknik.831391.
EndNote Tokdemir G (September 1, 2021) Using Text Mining For Research Trends in Empirical Software Engineering. Politeknik Dergisi 24 3 1227–1235.
IEEE G. Tokdemir, “Using Text Mining For Research Trends in Empirical Software Engineering”, Politeknik Dergisi, vol. 24, no. 3, pp. 1227–1235, 2021, doi: 10.2339/politeknik.831391.
ISNAD Tokdemir, Gul. “Using Text Mining For Research Trends in Empirical Software Engineering”. Politeknik Dergisi 24/3 (September 2021), 1227-1235. https://doi.org/10.2339/politeknik.831391.
JAMA Tokdemir G. Using Text Mining For Research Trends in Empirical Software Engineering. Politeknik Dergisi. 2021;24:1227–1235.
MLA Tokdemir, Gul. “Using Text Mining For Research Trends in Empirical Software Engineering”. Politeknik Dergisi, vol. 24, no. 3, 2021, pp. 1227-35, doi:10.2339/politeknik.831391.
Vancouver Tokdemir G. Using Text Mining For Research Trends in Empirical Software Engineering. Politeknik Dergisi. 2021;24(3):1227-35.
 

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International license.