Year 2019, Volume 7, Issue 3, Pages 467-472, 2019-09-28

Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets

Zekeriya Anıl Güven [1], Banu Diri [2], Tolgahan Çakaloğlu [3]


Understanding the reasons behind emotions expressed on social media plays a key role in learning to characterize the mood of previously unseen texts. The ability to classify mood makes this technology useful in a variety of fields. In this study, Latent Dirichlet Allocation (LDA), a topic modeling algorithm, was used to determine which emotions tweets on Twitter express. The dataset consists of 4000 tweets categorized into 5 emotions: anger, fear, happiness, sadness, and surprise. Zemberek, Snowball, and first-5-letters root extraction methods were used to create the models. The generated models were tested using the proposed n-stage LDA method, which aims to increase the model's success rate by decreasing the number of words in the dictionary. With multi-stage LDA, the success rate for 5 classes increased from 60.375% with standard LDA to 70.5% with 2 stages and 76.375% with 3 stages.

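The two dictionary-related ideas named in the abstract, first-5-letters root extraction and shrinking the dictionary between LDA stages, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy topic-word weights (in practice they would come from a trained LDA model), and the pruning rule that keeps a word when its weight reaches the mean weight of at least one topic. The abstract does not state the actual pruning criterion.

```python
from typing import Dict, List, Set


def first_n_letters_root(word: str, n: int = 5) -> str:
    """Approximate a root by keeping only the first n letters of a word."""
    return word[:n]


def prune_dictionary(topic_word_weights: Dict[str, Dict[str, float]]) -> Set[str]:
    """Keep words whose weight reaches the mean weight of at least one topic.

    ASSUMPTION: the mean-weight threshold is a hypothetical pruning rule,
    chosen only to show how one stage could shrink the dictionary.
    """
    kept: Set[str] = set()
    for weights in topic_word_weights.values():
        threshold = sum(weights.values()) / len(weights)
        kept.update(w for w, p in weights.items() if p >= threshold)
    return kept


def filter_documents(docs: List[List[str]], kept: Set[str]) -> List[List[str]]:
    """Drop every token that is no longer in the pruned dictionary."""
    return [[tok for tok in doc if tok in kept] for doc in docs]


# Toy tokenized tweets, reduced to first-5-letter roots.
raw_docs = [["mutluluk", "bugun"], ["korkuyorum", "bugun"]]
docs = [[first_n_letters_root(tok) for tok in doc] for doc in raw_docs]

# Hypothetical topic-word weights standing in for a trained LDA model.
toy_weights = {
    "topic_0": {"mutlu": 0.6, "bugun": 0.1, "korku": 0.3},
    "topic_1": {"mutlu": 0.1, "bugun": 0.2, "korku": 0.7},
}

kept_words = prune_dictionary(toy_weights)
reduced_docs = filter_documents(docs, kept_words)
```

In an n-stage setting, a new LDA model would be trained on `reduced_docs` and the pruning step repeated; that training step is omitted here to keep the sketch free of external libraries.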
Primary Language: English
Subjects: Engineering
Publication Date: September 2019
Section: Articles
Authors

ORCID: 0000-0002-7025-2815
Author: Zekeriya Anıl Güven (Corresponding Author)
Institution: COMPUTER ENGINEERING (DR)
Country: Turkey


ORCID: 0000-0002-4052-0049
Author: Banu Diri
Institution: DEPARTMENT OF COMPUTER ENGINEERING
Country: Turkey


ORCID: 0000-0002-4711-7287
Author: Tolgahan Çakaloğlu
Institution: University of Arkansas, Computer Science and Computer Engineering Department
Country: United States


Dates

Publication Date: 28 September 2019

BibTeX @article{apjes459447, journal = {Akademik Platform Mühendislik ve Fen Bilimleri Dergisi}, issn = {}, eissn = {2147-4575}, address = {}, publisher = {Akademik Platform}, year = {2019}, volume = {7}, pages = {467-472}, doi = {10.21541/apjes.459447}, title = {Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets}, key = {cite}, author = {Güven, Zekeriya Anıl and Diri, Banu and Çakaloğlu, Tolgahan} }
APA Güven, Z., Diri, B., Çakaloğlu, T. (2019). Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets. Akademik Platform Mühendislik ve Fen Bilimleri Dergisi, 7(3), 467-472. DOI: 10.21541/apjes.459447
MLA Güven, Z., Diri, B., Çakaloğlu, T. "Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets". Akademik Platform Mühendislik ve Fen Bilimleri Dergisi 7 (2019): 467-472 <https://dergipark.org.tr/tr/pub/apjes/issue/44190/459447>
Chicago Güven, Z., Diri, B., Çakaloğlu, T. "Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets". Akademik Platform Mühendislik ve Fen Bilimleri Dergisi 7 (2019): 467-472
RIS TY - JOUR T1 - Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets AU - Zekeriya Anıl Güven, Banu Diri, Tolgahan Çakaloğlu Y1 - 2019 PY - 2019 N1 - doi: 10.21541/apjes.459447 DO - 10.21541/apjes.459447 T2 - Akademik Platform Mühendislik ve Fen Bilimleri Dergisi JF - Journal JO - JOR SP - 467 EP - 472 VL - 7 IS - 3 SN - 2147-4575 M3 - doi: 10.21541/apjes.459447 UR - https://doi.org/10.21541/apjes.459447 Y2 - 2019 ER -
EndNote %0 Akademik Platform Mühendislik ve Fen Bilimleri Dergisi Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets %A Zekeriya Anıl Güven , Banu Diri , Tolgahan Çakaloğlu %T Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets %D 2019 %J Akademik Platform Mühendislik ve Fen Bilimleri Dergisi %P -2147-4575 %V 7 %N 3 %R doi: 10.21541/apjes.459447 %U 10.21541/apjes.459447
ISNAD Güven, Zekeriya Anıl, Diri, Banu, Çakaloğlu, Tolgahan. "Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets". Akademik Platform Mühendislik ve Fen Bilimleri Dergisi 7/3 (September 2019): 467-472. https://doi.org/10.21541/apjes.459447
AMA Güven Z, Diri B, Çakaloğlu T. Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets. APJES. 2019; 7(3): 467-472.
Vancouver Güven Z, Diri B, Çakaloğlu T. Emotion Detection with n-stage Latent Dirichlet Allocation for Turkish Tweets. Akademik Platform Mühendislik ve Fen Bilimleri Dergisi. 2019; 7(3): 467-472.