Sentiment Analysis on Twitter Based on Ensemble of Psychological and Linguistic Feature Sets
Abstract
With advances in information and communication technologies, social media and microblogging platforms have become an important source of information. On microblogging platforms, people share their opinions, complaints, sentiments and attitudes towards topics, current issues and products. Sentiment analysis is an important research direction in natural language processing that aims to identify the sentiment orientation of source materials. Twitter is a popular microblogging platform on which people all over the world interact through user-generated text messages. Information obtained from Twitter can serve as an essential source for several applications, including event detection, news recommendation and crisis management. In sentiment classification, identifying an appropriate feature subset plays an important role. LIWC (Linguistic Inquiry and Word Count) is an exploratory text analysis tool that extracts psycholinguistic features from text documents. In this paper, we present a psycholinguistic approach to sentiment analysis on Twitter. In this scheme, we use five main LIWC categories (linguistic processes, psychological processes, personal concerns, spoken categories and punctuation) as feature sets. The experimental analysis considers the five LIWC categories individually and in ensemble combinations. To explore the predictive performance of the different feature engineering schemes, four supervised learning algorithms (Naïve Bayes, support vector machines, k-nearest neighbor and logistic regression) and three ensemble learning methods (AdaBoost, Bagging and Random Subspace) are used. The experimental results indicate that the ensemble feature sets yield higher predictive performance than the individual feature sets.
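The evaluation scheme described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: it assumes scikit-learn, replaces real LIWC output with random synthetic feature blocks, and uses hypothetical column counts per category (`CATEGORY_SIZES`). Random Subspace is approximated here with `BaggingClassifier` trained on random column subsets, and the hyperparameters are arbitrary.

```python
from itertools import combinations

import numpy as np
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical column counts per LIWC category (assumption; real counts depend on the LIWC version).
CATEGORY_SIZES = {
    "linguistic": 6,
    "psychological": 8,
    "personal_concerns": 4,
    "spoken": 2,
    "punctuation": 3,
}


def make_synthetic_features(n_tweets=200, seed=0):
    """Return (feature blocks by category, binary labels) as a random stand-in for LIWC output."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n_tweets)
    blocks = {}
    for name, width in CATEGORY_SIZES.items():
        # Shift each block slightly by label so there is a weak signal to learn.
        blocks[name] = rng.normal(size=(n_tweets, width)) + 0.3 * labels[:, None]
    return blocks, labels


def combine(blocks, names):
    """Concatenate the chosen category blocks column-wise into one feature matrix."""
    return np.hstack([blocks[n] for n in names])


def build_classifiers():
    """The four base learners and three ensemble methods named in the abstract."""
    return {
        "naive_bayes": GaussianNB(),
        "svm": make_pipeline(StandardScaler(), SVC(kernel="linear")),
        "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
        "logistic_regression": LogisticRegression(max_iter=1000),
        "adaboost": AdaBoostClassifier(n_estimators=25, random_state=0),
        "bagging": BaggingClassifier(
            LogisticRegression(max_iter=1000), n_estimators=10, random_state=0
        ),
        # Random Subspace: each member sees all rows but a random half of the columns.
        "random_subspace": BaggingClassifier(
            LogisticRegression(max_iter=1000), n_estimators=10,
            bootstrap=False, max_features=0.5, random_state=0,
        ),
    }


def evaluate(blocks, labels, names, folds=5):
    """Mean cross-validated accuracy of every classifier on one feature-set combination."""
    X = combine(blocks, names)
    return {
        clf_name: cross_val_score(clf, X, labels, cv=folds).mean()
        for clf_name, clf in build_classifiers().items()
    }


if __name__ == "__main__":
    blocks, labels = make_synthetic_features()
    # Compare each single category against the combination of all five.
    for size in (1, len(CATEGORY_SIZES)):
        for names in combinations(CATEGORY_SIZES, size):
            scores = evaluate(blocks, labels, names)
            print("+".join(names), {k: round(v, 3) for k, v in scores.items()})
```

Because the data are synthetic, the printed accuracies say nothing about the paper's findings. The sketch only shows how individual and combined feature sets can be passed through the same set of classifiers for comparison.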
Details
Primary Language
English
Subjects
Engineering
Section
Research Article
Publication Date
April 30, 2018
Submission Date
July 25, 2017
Acceptance Date
November 16, 2017
Published in Issue
Year 2018, Volume 6, Issue 2
Cited By
A Meta-Ensemble Classifier Approach: Random Rotation Forest
Balkan Journal of Electrical and Computer Engineering
https://doi.org/10.17694/bajece.502156
Ensemble of Classifiers and Term Weighting Schemes for Sentiment Analysis in Turkish
Scientific Research Communications
https://doi.org/10.52460/src.2021.004
A comparative study of keyword extraction algorithms for English texts
Journal of Intelligent Systems
https://doi.org/10.1515/jisys-2021-0040
An Incremental Approach to Corpus Design and Construction: Application to a Large Contemporary Saudi Corpus
IEEE Access
https://doi.org/10.1109/ACCESS.2021.3089924
The power of ensemble learning in sentiment analysis
Expert Systems with Applications
https://doi.org/10.1016/j.eswa.2021.115819
Grade Prediction in Blended Learning Using Multisource Data
Scientific Programming
https://doi.org/10.1155/2021/4513610
Improving Arabic Sentiment Analysis Using CNN-Based Architectures and Text Preprocessing
Computational Intelligence and Neuroscience
https://doi.org/10.1155/2021/5538791
Predicting Learning Behavior Using Log Data in Blended Teaching
Scientific Programming
https://doi.org/10.1155/2021/4327896
Solving Misclassification of the Credit Card Imbalance Problem Using Near Miss
Mathematical Problems in Engineering
https://doi.org/10.1155/2021/7194728
Research on Diagnosis Prediction of Traditional Chinese Medicine Diseases Based on Improved Bayesian Combination Model
Evidence-Based Complementary and Alternative Medicine
https://doi.org/10.1155/2021/5513748
Automatic Personality Evaluation from Transliterations of YouTube Vlogs Using Classical and State of the art Word Embeddings
Ingeniería e Investigación
https://doi.org/10.15446/ing.investig.93803
Aspect Based Opinion Mining on Hotel Reviews
International Journal of Advances in Engineering and Pure Sciences
https://doi.org/10.7240/jeps.896515
Arabic sentiment analysis using GCL-based architectures and a customized regularization function
Engineering Science and Technology, an International Journal
https://doi.org/10.1016/j.jestch.2023.101433
Dental Impression Tray Selection From Maxillary Arch Images Using Multi-Feature Fusion and Ensemble Classifier
IEEE Access
https://doi.org/10.1109/ACCESS.2021.3059785
Equity Research Report-Driven Investment Strategy in Korea Using Binary Classification on Stock Price Direction
IEEE Access
https://doi.org/10.1109/ACCESS.2021.3067691
Bayesian Attribute Bagging-Based Extreme Learning Machine for High-Dimensional Classification and Regression
ACM Transactions on Intelligent Systems and Technology
https://doi.org/10.1145/3495164
Using the Ship-Gram Model for Japanese Keyword Extraction Based on News Reports
Complexity
https://doi.org/10.1155/2021/9965843
Cost-sensitive regression learning on small dataset through intra-cluster product favoured feature selection
Connection Science
https://doi.org/10.1080/09540091.2021.1970719
A New Big Data Feature Selection Approach for Text Classification
Scientific Programming
https://doi.org/10.1155/2021/6645345
Sentence Classification Using N-Grams in Urdu Language Text
Scientific Programming
https://doi.org/10.1155/2021/1296076
A comprehensive review and evaluation on text predictive and entertainment systems
Soft Computing
https://doi.org/10.1007/s00500-021-06691-4
Using artificial intelligence techniques for detecting Covid-19 epidemic fake news in Moroccan tweets
Results in Physics
https://doi.org/10.1016/j.rinp.2021.104266
A systematic literature review on machine learning applications for consumer sentiment analysis using online reviews
Computer Science Review
https://doi.org/10.1016/j.cosrev.2021.100413
A Study of Lightweight Approaches to Analyze Crime Conditions in India
Journal of Applied Security Research
https://doi.org/10.1080/19361610.2021.2006031
Rethinking of BERT sentence embedding for text classification
Neural Computing and Applications
https://doi.org/10.1007/s00521-024-10212-3
A Robust Context‐Based Deep Learning Approach for Highly Imbalanced Hyperspectral Classification
Computational Intelligence and Neuroscience
https://doi.org/10.1155/2021/9923491
Intercultural Attitudes Embedded in Microblogging: Sentiment and Content Analyses of Data from Sina Weibo
Journalism and Media
https://doi.org/10.3390/journalmedia5040092
Analysis of Software Developers' Programming Language Preferences and Community Behavior From Big5 Personality Traits
Software: Practice and Experience
https://doi.org/10.1002/spe.3381
Fine-Tuning Retrieval-Augmented Generation with an Auto-Regressive Language Model for Sentiment Analysis in Financial Reviews
Applied Sciences
https://doi.org/10.3390/app142310782
Users’ Discourse from primarily US-focused subreddits about the Political Image of the Kingdom of Saudi Arabia from 2015 to 2023
Computers in Human Behavior Reports
https://doi.org/10.1016/j.chbr.2024.100543
Makine Öğrenmesi ve Derin Öğrenmeye Dayalı Duygu Analizinde Metin Temsil Yöntemlerinin Sınıflandırma Başarımına Etkisinin İncelenmesi
Karadeniz Fen Bilimleri Dergisi
https://doi.org/10.31466/kfbd.1536270
The Words of Morality: An Italian Adaptation of the Moral Foundations Dictionary 2.0 for the Linguistic Inquiry Word Count
Journal of Language and Social Psychology
https://doi.org/10.1177/0261927X251386477