Performance Comparison of Different Classification Algorithms and Text Representation Methods in Sentiment Analysis

Batuhan Cem Öğe; Fatih Kayaalp

doi:10.29130/dubited.1015320

Research Article

Year 2021, Volume: 9 Issue: 6 - ICAIAME 2021, 406 - 416, 31.12.2021

Cited By: 3

Abstract

In recent years, as internet access has expanded and smartphone use has become widespread, services known as social media, where people share their opinions on a variety of topics, have come into very wide use. Sentiment Analysis, which means drawing meaningful inferences about people's feelings on different topics by analyzing social media data and which is essentially a classification task, has been one of the prominent fields of study in recent years. In this study, the performance of six classification algorithms, namely Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and Artificial Neural Network (ANN), was compared in the context of Sentiment Analysis using libraries available in the Python programming language. The dataset consists of open-source, labeled user reviews from the IMDB site. The dataset was cleaned using Natural Language Processing methods, and the Bag of Words (BoW), TF-IDF, FastText and Word2Vec text representation methods were used to represent it numerically. For training and testing, k-fold cross validation was used with k=5. For the six classification methods, accuracy, precision, recall and F1 score were calculated, a detailed comparison was made, and the results were recorded. The highest accuracy values were obtained with LR and SVM, which gave similar results: 86% with BoW, 87% with TF-IDF, 87% with Word2Vec and 83% with FastText.
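The evaluation protocol described above (5-fold cross validation scored with accuracy, precision, recall and F1) can be sketched in plain Python as follows. This is a minimal illustration, not the authors' code: the helper names `k_fold_indices` and `binary_metrics` are hypothetical, the folds are contiguous and unshuffled (the abstract does not say whether shuffling or stratification was used), and the positive class is assumed to be labeled `1`.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds.

    Assumption: the source only states k=5; fold construction details are ours.
    """
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary labeling."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy labels only; these are not results from the study.
    for train_idx, test_idx in k_fold_indices(10, k=5):
        print(len(train_idx), len(test_idx))
    print(binary_metrics([1, 1, 1, 0], [1, 1, 0, 0]))
```

In practice each fold's training indices would be used to fit a classifier and the test indices to score it, with the four metrics averaged over the five folds.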

Keywords

Natural Language Processing, Sentiment Analysis, Machine Learning, Text Representation, Classification, Data Mining

References

[1] B. Agarwal, N. Mittal, P. Bansal, and S. Garg, “Sentiment analysis using common-sense and context information,” Computational Intelligence and Neuroscience, vol. 2015, pp. 1–9, 2015.
[2] N. Mishra and C. K. Jha, “Classification of opinion mining techniques,” International Journal of Computer Applications, vol. 56, no. 13, pp. 1–6, 2012.
[3] B. Bansal and S. Srivastava, “Sentiment classification of online consumer reviews using word vector representations,” Procedia Computer Science, vol. 132, pp. 1147–1153, 2018.
[4] S. Symeonidis, D. Effrosynidis, and A. Arampatzis, “A comparative evaluation of pre-processing techniques and their interactions for twitter sentiment analysis,” Expert Systems with Applications, vol. 110, pp. 298–310, 2018.
[5] B. Haryanto, Y. Ruldeviyani, F. Rohman, T. N. Julius Dimas, R. Magdalena, and F. Muhamad Yasil, “Facebook analysis of community sentiment on 2019 Indonesian presidential candidates from Facebook opinion data,” Procedia Computer Science, vol. 161, pp. 715–722, 2019.
[6] E. D’Andrea, P. Ducange, A. Bechini, A. Renda, and F. Marcelloni, “Monitoring the public opinion about the vaccination topic from tweets analysis,” Expert Systems with Applications, vol. 116, pp. 209–226, 2019.
[7] A. Alsaeedi and M. Z. Khan, “A study on sentiment analysis techniques of Twitter data,” International Journal of Advanced Computer Science and Applications, vol. 10, no. 2, pp. 361–374, 2019.
[8] J. Khairnar and M. Kinikar, “Machine learning algorithms for opinion mining and sentiment classification,” International Journal of Scientific and Research Publications, vol. 3, no. 6, pp. 1–6, 2013.
[9] A. Tyagi and N. Sharma, “Sentiment Analysis using logistic regression and effective word score heuristic,” International Journal of Engineering and Technology (UAE), vol. 7, no. 2, pp. 20–23, 2018.
[10] H. Kaur, V. Mangat, and Nidhi, “A survey of sentiment analysis techniques,” Proceedings of the International Conference on IoT in Social, Mobile, Analytics and Cloud, I-SMAC 2017, 2017, pp. 921–925.
[11] M. M and S. Mehla, “Sentiment analysis of movie reviews using machine learning classifiers,” International Journal of Computer Applications, vol. 182, no. 50, pp. 25–28, 2019.
[12] F. Hemmatian and M. K. Sohrabi, “A survey on classification techniques for opinion mining and sentiment analysis,” Artificial Intelligence Review, vol. 52, no. 3, pp. 1495–1545, 2019.
[13] A. L. Maas, R. E. Daly, P. T. Pham, D. Huang, A. Y. Ng, and C. Potts, “Learning word vectors for sentiment analysis,” ACL-HLT 2011 - Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, vol. 1, pp. 142–150, 2011.

Performance Comparison of Different Classification Algorithms and Text Representation Methods in Sentiment Analysis

Abstract

Due to growing internet access and the widespread use of smartphones in recent years, social media services, where people share their opinions on various issues, have become widely used. Sentiment Analysis, which draws meaningful inferences about people's emotions on different subjects by analyzing social media data and is essentially a classification process, has been one of the prominent fields of study in recent years. In this study, the performance of six classification algorithms, namely Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF) and Artificial Neural Network (ANN), was compared for Sentiment Analysis using libraries in the Python programming language. The dataset consists of open-source, labeled user comments from the IMDB site. Bag of Words (BoW), TF-IDF, FastText and Word2Vec text representation methods were used to represent numerically the dataset, which had been cleaned using Natural Language Processing methods. For training and testing, the k-fold cross validation method was used with k=5. For the six classification methods, accuracy, precision, recall and F1 score were calculated, a detailed comparison was made, and the results were recorded. The highest accuracy values were obtained with LR and SVM, which gave similar results: 86% with BoW, 87% with TF-IDF, 87% with Word2Vec and 83% with FastText.
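To make the two count-based text representations named in the abstract concrete, the following is a minimal plain-Python sketch of Bag of Words and TF-IDF. It is an illustration under stated assumptions, not the study's implementation: the function names are hypothetical, tokenization is a simple lowercase whitespace split, and TF-IDF uses one common textbook variant (term frequency normalized by document length, idf = ln(N / df)), which may differ from the weighting used by the libraries in the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Assumed tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count vectors); one vector of raw term counts per document."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    vectors = []
    for doc in docs:
        counts = Counter(tokenize(doc))
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf = count / doc length, idf = ln(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each vocabulary term.
    doc_freq = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    weighted = []
    for row in counts:
        total = sum(row) or 1  # avoid division by zero for empty documents
        weighted.append([(c / total) * w for c, w in zip(row, idf)])
    return vocab, weighted


if __name__ == "__main__":
    # Toy reviews for illustration; not taken from the IMDB dataset.
    reviews = ["good movie", "bad movie"]
    print(bag_of_words(reviews))
    print(tf_idf(reviews))
```

With this variant, a term that appears in every document (here "movie") receives zero weight, which is the intuition behind TF-IDF down-weighting uninformative words relative to raw counts.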

Keywords

Natural Language Processing, Sentiment Analysis, Machine Learning, Text Representation, Classification, Data Mining

There are 13 references in total.

Details

Primary Language	Turkish
Subjects	Engineering
Section	Articles
Authors	Batuhan Cem Öğe 0000-0001-5347-3352 Fatih Kayaalp 0000-0002-8752-3335
Publication Date	December 31, 2021
Published in Issue	Year 2021 Volume: 9 Issue: 6 - ICAIAME 2021

Cite

APA	Öğe, B. C., & Kayaalp, F. (2021). Farklı Sınıflandırma Algoritmaları ve Metin Temsil Yöntemlerinin Duygu Analizinde Performans Karşılaştırılması. Duzce University Journal of Science and Technology, 9(6), 406-416. https://doi.org/10.29130/dubited.1015320
AMA	Öğe BC, Kayaalp F. Farklı Sınıflandırma Algoritmaları ve Metin Temsil Yöntemlerinin Duygu Analizinde Performans Karşılaştırılması. DÜBİTED. December 2021;9(6):406-416. doi:10.29130/dubited.1015320
Chicago	Öğe, Batuhan Cem, and Fatih Kayaalp. “Farklı Sınıflandırma Algoritmaları Ve Metin Temsil Yöntemlerinin Duygu Analizinde Performans Karşılaştırılması”. Duzce University Journal of Science and Technology 9, no. 6 (December 2021): 406-16. https://doi.org/10.29130/dubited.1015320.
EndNote	Öğe BC, Kayaalp F (01 December 2021) Farklı Sınıflandırma Algoritmaları ve Metin Temsil Yöntemlerinin Duygu Analizinde Performans Karşılaştırılması. Duzce University Journal of Science and Technology 9 6 406–416.
IEEE	B. C. Öğe and F. Kayaalp, “Farklı Sınıflandırma Algoritmaları ve Metin Temsil Yöntemlerinin Duygu Analizinde Performans Karşılaştırılması”, DÜBİTED, vol. 9, no. 6, pp. 406–416, 2021, doi: 10.29130/dubited.1015320.
ISNAD	Öğe, Batuhan Cem - Kayaalp, Fatih. “Farklı Sınıflandırma Algoritmaları Ve Metin Temsil Yöntemlerinin Duygu Analizinde Performans Karşılaştırılması”. Duzce University Journal of Science and Technology 9/6 (December 2021), 406-416. https://doi.org/10.29130/dubited.1015320.
JAMA	Öğe BC, Kayaalp F. Farklı Sınıflandırma Algoritmaları ve Metin Temsil Yöntemlerinin Duygu Analizinde Performans Karşılaştırılması. DÜBİTED. 2021;9:406–416.
MLA	Öğe, Batuhan Cem and Fatih Kayaalp. “Farklı Sınıflandırma Algoritmaları Ve Metin Temsil Yöntemlerinin Duygu Analizinde Performans Karşılaştırılması”. Duzce University Journal of Science and Technology, vol. 9, no. 6, 2021, pp. 406-16, doi:10.29130/dubited.1015320.
Vancouver	Öğe BC, Kayaalp F. Farklı Sınıflandırma Algoritmaları ve Metin Temsil Yöntemlerinin Duygu Analizinde Performans Karşılaştırılması. DÜBİTED. 2021;9(6):406-16.

Cited By

Classification and Analysis of Employee Feedback with Deep Learning Algorithms

Sakarya University Journal of Computer and Information Sciences

https://doi.org/10.35377/saucis...1627619

An Analysis of Intelligent Turkish Text Classification Models for Routing Calls in Call Centers: A Case Study on the Republic of Turkiye Ministry of Trade Call Center

Sakarya University Journal of Computer and Information Sciences

https://doi.org/10.35377/saucis...1402414

A natural language processing framework for analyzing public transportation user satisfaction: a case study

Journal of Innovative Transportation

https://doi.org/10.53635/jit.1274928