Research Article

Occupation Prediction from Twitter Data

Year 2025, Volume: 27 Issue: 80, 267 - 271, 23.05.2025
https://doi.org/10.21205/deufmd.2025278013

Abstract

Social media use is now widespread. Among social media platforms, Twitter, now known as X, stands out for its number of users and the volume of data it produces, and this data can support many kinds of studies. This study aims to predict occupation from Turkish tweets. Five datasets of different sizes were used, and the tweets were evaluated both singly and in pairs so that the two settings could be compared. Two different natural language processing libraries were used in the pre-processing step, and different machine learning methods, deep learning methods, and pre-trained models were then tested. Among the machine learning methods, Logistic Regression achieved the highest accuracy, 88%, on pairwise tweet data; among the deep learning models, the Multi-layer Perceptron also reached 88%. BERT and "ytu-ce-cosmos/turkish-base-bert-uncased", developed by the Yıldız Technical University COSMOS AI Research Team, were used as pre-trained models. Although these models gave different results on different datasets, both achieved their highest success rate, 89%, on pairwise tweet data.
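
The abstract does not describe how the pairwise tweet samples were constructed. As a minimal sketch only, assuming that a pairwise sample is two tweets with the same occupation label joined into one text, the data preparation might look like the following. The function name `make_pairwise_samples`, the space-joined concatenation, the dropping of a leftover unpaired tweet, and the toy tweets are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict
from typing import List, Tuple


def make_pairwise_samples(tweets: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Join consecutive tweets that share an occupation label into one sample.

    Assumption: two same-label tweets are concatenated with a space, and a
    leftover unpaired tweet is dropped. The source does not specify this.
    """
    # Group tweet texts by their occupation label, preserving input order.
    by_label = defaultdict(list)
    for text, label in zip(tweets, labels):
        by_label[label].append(text)

    samples = []
    for label, texts in by_label.items():
        # Step through the tweets two at a time: (0, 1), (2, 3), ...
        for i in range(0, len(texts) - 1, 2):
            samples.append((texts[i] + " " + texts[i + 1], label))
    return samples


# Toy data, invented for illustration only.
tweets = [
    "hasta muayenesi bitti",
    "nöbet uzun sürdü",
    "duruşma ertelendi",
    "dilekçe hazır",
    "ameliyat başarılı",
]
labels = ["doctor", "doctor", "lawyer", "lawyer", "doctor"]
pairs = make_pairwise_samples(tweets, labels)
```

Under this assumption each pairwise sample carries roughly twice as much text as a single tweet, which is one plausible reason a classifier could score higher on pairwise data; the abstract itself does not state the cause.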

Details

Primary Language: English
Subjects: Distributed Computing and Systems Software (Other)
Section: Research Article
Authors

Tolga İzdaş 0009-0007-3643-3990

Hikmet İskifoğlu 0009-0004-5397-6194

Banu Diri 0000-0002-6652-4339

Early View Date: May 12, 2025
Publication Date: May 23, 2025
Submission Date: August 21, 2024
Acceptance Date: September 18, 2024
Published in Issue: Year 2025, Volume: 27, Issue: 80

Cite

APA: İzdaş, T., İskifoğlu, H., & Diri, B. (2025). Occupation Prediction from Twitter Data. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi, 27(80), 267-271. https://doi.org/10.21205/deufmd.2025278013
AMA: İzdaş T, İskifoğlu H, Diri B. Occupation Prediction from Twitter Data. DEUFMD. May 2025;27(80):267-271. doi:10.21205/deufmd.2025278013
Chicago: İzdaş, Tolga, Hikmet İskifoğlu, and Banu Diri. “Occupation Prediction from Twitter Data”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi 27, no. 80 (May 2025): 267-71. https://doi.org/10.21205/deufmd.2025278013.
EndNote: İzdaş T, İskifoğlu H, Diri B (01 May 2025) Occupation Prediction from Twitter Data. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi 27 80 267–271.
IEEE: T. İzdaş, H. İskifoğlu, and B. Diri, “Occupation Prediction from Twitter Data”, DEUFMD, vol. 27, no. 80, pp. 267–271, 2025, doi: 10.21205/deufmd.2025278013.
ISNAD: İzdaş, Tolga et al. “Occupation Prediction from Twitter Data”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi 27/80 (May 2025), 267-271. https://doi.org/10.21205/deufmd.2025278013.
JAMA: İzdaş T, İskifoğlu H, Diri B. Occupation Prediction from Twitter Data. DEUFMD. 2025;27:267–271.
MLA: İzdaş, Tolga et al. “Occupation Prediction from Twitter Data”. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi, vol. 27, no. 80, 2025, pp. 267-71, doi:10.21205/deufmd.2025278013.
Vancouver: İzdaş T, İskifoğlu H, Diri B. Occupation Prediction from Twitter Data. DEUFMD. 2025;27(80):267-71.

Dokuz Eylül Üniversitesi, Mühendislik Fakültesi Dekanlığı Tınaztepe Yerleşkesi, Adatepe Mah. Doğuş Cad. No: 207-I / 35390 Buca-İZMİR.