Research Article

Analysis of Skills and Qualifications Required in Data Scientist Job Postings Based on the Pareto Analysis Perspective Using Text Mining

Year 2023, Issue: 39, 10 - 25, 27.12.2023
https://doi.org/10.26650/ekoist.2023.39.1256697

Abstract

The number of job postings is larger than ever, which makes it difficult for job seekers to find the position that suits them best. Text mining methods can help overcome this difficulty by extracting information such as job titles, required skills, and required experience from job postings so that the postings can be analyzed. The same information can also be used to match job seekers with the most relevant postings. The main purpose of this research is to determine which skills, techniques, subjects, and fields job seekers should prioritize. For this purpose, 200 data scientist job postings from Turkey and 200 data scientist job postings from the USA are analyzed. According to the results, employers seeking to hire a data scientist prefer candidates who are experts in Machine Learning, Data Science, Python, SQL, R, Statistics, and Mathematics; who hold a BSc, MSc, or PhD degree; who have 3+ years of work experience; and who know Visualization, Data Mining, Prediction, NLP, and Clustering techniques. It is therefore recommended that people who want to become data scientists in Turkey or the USA develop these skills, techniques, and experience in order to be hired for data scientist positions more easily.
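As a rough illustration of the kind of analysis the abstract describes, the following minimal Python sketch counts how many postings mention each skill and then builds a Pareto table of cumulative shares. It is not the authors' pipeline: the skill vocabulary, the four toy postings, the 80% threshold, and all function names are assumptions made only for this example.

```python
import re
from collections import Counter

# Hypothetical skill vocabulary and toy postings (NOT the study's data).
SKILLS = ["machine learning", "python", "sql", "statistics", "r"]

POSTINGS = [
    "We need Python, SQL and machine learning experience.",
    "Strong statistics background; Python and R required.",
    "Machine learning engineer with Python skills.",
    "SQL and Python for reporting.",
]


def count_skill_mentions(postings, skills):
    """Count in how many postings each skill appears (document frequency)."""
    counts = Counter()
    for text in postings:
        lowered = text.lower()
        for skill in skills:
            # Require non-word characters on both sides so that the skill
            # "r" does not match inside words such as "required".
            pattern = r"(?<!\w)" + re.escape(skill) + r"(?!\w)"
            if re.search(pattern, lowered):
                counts[skill] += 1
    return counts


def pareto_table(counts):
    """Return (skill, count, cumulative share in %) rows, largest count first."""
    total = sum(counts.values())
    if total == 0:
        return []
    rows, running = [], 0
    # Ties are broken alphabetically to keep the ordering deterministic.
    for skill, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += n
        rows.append((skill, n, round(100 * running / total, 1)))
    return rows


def vital_few(rows, threshold=80.0):
    """Return the leading skills needed to reach the cumulative threshold."""
    selected = []
    for skill, _, cumulative in rows:
        selected.append(skill)
        if cumulative >= threshold:
            break
    return selected


if __name__ == "__main__":
    table = pareto_table(count_skill_mentions(POSTINGS, SKILLS))
    for skill, n, cumulative in table:
        print(f"{skill:<18}{n:>3}{cumulative:>8.1f}%")
    print("Vital few:", vital_few(table))
```

On real postings, a step like this would normally follow text preprocessing (lowercasing, stop-word removal, lemmatization) and would need a curated skill dictionary; the simple pattern matching here is only meant to show how frequency counts feed a Pareto ranking.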

There are 53 citations in total.

Details

Primary Language: English
Subjects: Econometrics (Other)
Journal Section: Research Article
Authors

Erkan Işığıçok 0000-0003-4037-0869

Sadullah Çelik 0000-0001-5468-475X

Dilek Özdemir Yılmaz 0000-0002-0548-0694

Publication Date: December 27, 2023
Submission Date: February 26, 2023
Published in Issue: Year 2023, Issue: 39

Cite

APA: Işığıçok, E., Çelik, S., & Özdemir Yılmaz, D. (2023). Analysis of Skills and Qualifications Required in Data Scientist Job Postings Based on the Pareto Analysis Perspective Using Text Mining. EKOIST Journal of Econometrics and Statistics, (39), 10-25. https://doi.org/10.26650/ekoist.2023.39.1256697