Research Article

Stackoverflow gönderilerinde tartışılan trend konuların kelime frekans analizi ile belirlenmesi

Year 2021, Volume 11, Issue 2, 357 - 368, 15.04.2021
https://doi.org/10.17714/gumusfenbil.811123

References

  • Aggarwal, C.C. and Zhai, C. (2012). Mining Text Data. New York: Springer Science & Business Media.
  • Ahmed, T. and Srivastava, A. (2017). Understanding and evaluating the behavior of technical users. A study of developer interaction at StackOverflow. Human-centric Computing and Information Sciences, 7(1), 1-18. https://doi.org/10.1186/s13673-017-0091-8
  • Bakir, C., Hakkoymaz, V., Diri, B. and Güçlü, M. (2020). Dağıtık veritabanlarında saldırı önleme metotları. Gümüşhane Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 10(2), 425-441.
  • Barua, A., Thomas, S.W. and Hassan, A. E. (2014). What are developers talking about? An analysis of topics and trends in stack overflow. Empirical Software Engineering, 19(3), 619-654. https://doi.org/10.1007/s10664-012-9231-y
  • Cavusoglu, H., Li, Z. and Huang, K.W. (2015). Can gamification motivate voluntary contributions? The case of StackOverflow Q&A community. In Proceedings of the 18th ACM conference companion on computer supported cooperative work & social computing (pp. 171-174). New York.
  • Correa, D. and Sureka, A. (2013). Fit or unfit: analysis and prediction of closed questions on stack overflow. In Proceedings of the first ACM conference on Online social networks (pp. 201-212). Boston.
  • Gurcan, F. (2019). Extraction of core competencies for big data: Implications for competency-based engineering education. International Journal of Engineering Education, 35(4), 1110-1115.
  • Gurcan, F. and Kose, C. (2017). Analysis of software engineering industry needs and trends: Implications for education. International Journal of Engineering Education, 33(4), 1361-1368.
  • Gürcan, F. and Özyurt, Ö. (2019). Analysis of requirements for programming languages and tools in Turkish software industry. 2nd Turkish World Engineering and Science Congress (pp. 307-311). Antalya.
  • Gürcan, F. and Şevik, S. (2019). Expertise roles and skills required by the software development industry. In 2019 1st International Informatics and Software Engineering Conference (UBMYK) (pp. 1-4). Ankara.
  • Gürcan, F. (2009). Web içerik madenciliği ve konu sınıflandırılması. Yüksek Lisans Tezi, Karadeniz Teknik Üniversitesi Fen Bilimleri Enstitüsü, Trabzon.
  • Gürcan, F. (2018). Multi-class classification of Turkish texts with machine learning algorithms. In 2018 2nd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT) (pp. 1-5). Ankara.
  • Johri, V. and Bansal, S. (2018). Identifying trends in technologies and programming languages using topic modeling. 12th International Conference on Semantic Computing (ICSC2018) (pp. 391-396). California.
  • Joorabchi, A., English, M. and Mahdi, A.E. (2016). Text mining stackoverflow: An insight into challenges and subject-related difficulties faced by computer science learners. Journal of Enterprise Information Management, 29(2), 255-275. https://doi.org/10.1108/JEIM-11-2014-0109
  • Lijffijt, J., Papapetrou, P., Puolamäki, K. and Mannila, H. (2011). Analyzing word frequencies in large text corpora using inter-arrival times and bootstrapping. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (pp. 341-357). Berlin.
  • Murgia, A., Tourani, P., Adams, B. and Ortu, M. (2014). Do developers feel emotions? An exploratory analysis of emotions in software artifacts. In Proceedings of the 11th working conference on mining software repositories (pp. 262-271). Hyderabad.
  • Nasehi, S.M., Sillito, J., Maurer, F. and Burns, C. (2012). What makes a good code example?: A study of programming Q&A in stackoverflow. In 28th IEEE International Conference on Software Maintenance (ICSM) (pp. 25-34). Trento.
  • Özyurt, Ö. and Özyurt, H. (2010). Forum based learning: Content analysis of asynchronous discussion forums for the teaching of computer programming languages. International Educational Technology Conference (pp. 618-622). İstanbul.
  • Pal, A., Harper, F.M. and Konstan, J.A. (2012). Exploring question selection bias to identify experts and potential experts in community question answering. ACM Transactions on Information Systems, 30(2), 1-28. https://doi.org/10.1145/2180868.2180872
  • Rajput, N.K., Ahuja, B. and Riyal, M.K. (2019). A statistical probe into the word frequency and length distributions prevalent in the translations of Bhagavad Gita. Pramana, 92(4), 1-6. https://doi.org/10.1007/s12043-018-1709-8
  • Rosen, C. and Shihab, E. (2016). What are mobile developers asking about? A large scale study using stack overflow. Empirical Software Engineering, 21(3), 1192-1223. https://doi.org/10.1007/s10664-015-9379-3
  • Stack Exchange API. (2020, May 29). Retrieved from https://api.stackexchange.com/.
  • Xia, X., Lo, D., Wang, X. and Zhou, B. (2013). Tag recommendation in software information sites. In 10th Working Conference on Mining Software Repositories (MSR) (pp. 287-296). San Francisco.
  • Yang, X.L., Lo, D., Xia, X., Wan, Z.Y. and Sun, J.L. (2016). What security questions do developers ask? A large-scale study of stack overflow posts. Journal of Computer Science and Technology, 31(5), 910-924. https://doi.org/10.1007/s11390-016-1672-0
  • Zhang, J. and Zhu, G.B. (2018). Hot topic discovery research of StackOverflow programming website based on CBOW LDA topic model. Computer Science, 45(4), 208-214.

Identification of trend topics discussed in stackoverflow posts by word frequency analysis

Abstract

Software developers and computer scientists today frequently use online knowledge-sharing platforms, and StackOverflow is among the most prominent of them. Analyzing the information shared there can yield useful insights into current topics and trends. This study therefore examines the tags of posts shared on StackOverflow during 2019. Using textual content analysis based on word frequency analysis, a data set was built from the tags used in StackOverflow posts, and the most frequently used tags, along with the tags whose momentum was rising or falling, were analyzed. Detailed results are reported for the fifty most used tags, which were found to cluster into six classes. The results suggest that current technologies, especially innovative web, mobile, and non-relational database technologies, are widely used and gaining momentum. There is also a strong Python-oriented trend across programming languages, tools, and libraries. The findings are expected to provide useful information primarily to software developers and to individuals studying or planning a career in this field, as well as to curriculum designers and decision-makers.

There are 25 citations in total.

Details

Primary Language: Turkish
Subjects: Engineering
Journal Section: Articles
Authors:

Fatih Gürcan (ORCID: 0000-0001-9915-6686)

Özcan Özyurt (ORCID: 0000-0002-0047-6813)

Publication Date: April 15, 2021
Submission Date: October 15, 2020
Acceptance Date: February 9, 2021
Published in Issue: Year 2021

Cite

APA Gürcan, F., & Özyurt, Ö. (2021). Stackoverflow gönderilerinde tartışılan trend konuların kelime frekans analizi ile belirlenmesi. Gümüşhane Üniversitesi Fen Bilimleri Dergisi, 11(2), 357-368. https://doi.org/10.17714/gumusfenbil.811123