Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets
Abstract
Understanding the reasons behind the emotions expressed on social media plays a key role in characterizing the mood of previously unseen texts, and the ability to classify mood makes this technology useful in a variety of fields. In this study, Latent Dirichlet Allocation (LDA), a topic modeling algorithm, was used to determine which emotions tweets on Twitter express. The dataset consists of 4000 tweets categorized into 5 emotions: anger, fear, happiness, sadness, and surprise. Models were created using three root extraction methods: Zemberek, Snowball, and keeping the first 5 letters of each word. The generated models were tested using the proposed n-stage LDA method, which aims to increase the model's success rate by decreasing the number of words in the dictionary. For the 5-class task, the multi-stage LDA method (2 stages: 70.5%, 3 stages: 76.375%) increased the success rate compared to standard LDA (60.375%).
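As an illustration of the simplest of the three root extraction methods, the following is a minimal sketch of first-5-letters truncation and its effect on dictionary size. It is not the authors' implementation: the function names `first_n_letters_stem` and `build_vocabulary` and the sample tokens are hypothetical, and the tokens are assumed to be already lowercased and tokenized.

```python
from typing import Iterable, Set


def first_n_letters_stem(word: str, n: int = 5) -> str:
    """Approximate a word's root by keeping only its first n characters."""
    return word[:n]


def build_vocabulary(tokens: Iterable[str], n: int = 5) -> Set[str]:
    """Collect the distinct truncated forms that would make up the dictionary."""
    return {first_n_letters_stem(token, n) for token in tokens}


# Hypothetical, already-lowercased tokens (not taken from the study's dataset).
tokens = ["mutluyum", "mutluluk", "mutlu", "korkuyorum", "korkunç", "korktum"]

raw_vocabulary = set(tokens)
stemmed_vocabulary = build_vocabulary(tokens)

# Truncation merges inflected forms, so the dictionary shrinks.
print(len(raw_vocabulary), "distinct surface forms")
print(len(stemmed_vocabulary), "distinct truncated forms:", sorted(stemmed_vocabulary))
```

Truncation needs no language resources, which makes it a cheap baseline next to Zemberek and Snowball, but it can split forms of the same root (here "korktum" is kept apart from "korkuyorum") or merge unrelated words that share a prefix.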
Keywords
Details
Primary Language
English
Subjects
Engineering
Journal Section
Research Article
Authors
Banu Diri
0000-0002-4052-0049
Türkiye
Tolgahan Çakaloğlu
0000-0002-4711-7287
United States
Publication Date
September 28, 2019
Submission Date
September 12, 2018
Acceptance Date
March 13, 2019
Published in Issue
Year 2019 Volume: 7 Number: 3
Cited By
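1964-2022 Yılları Arasında İşletme Ana Bilim Dalı’nda Hazırlanan Tezlerin Gizli Dirichlet Tahsisi Yöntemi ile Konu Modellemesi (Topic Modeling of Theses Prepared in the Department of Business Administration Between 1964 and 2022 Using the Latent Dirichlet Allocation Method)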
Anadolu Üniversitesi Sosyal Bilimler Dergisi
https://doi.org/10.18037/ausbd.1272581