GenSent: Improving Sentiment Analysis Using Genetic Algorithm-Based Ensemble Optimization
Abstract
Social media platforms are now a primary medium for all types of communication, from personal interactions and opinion sharing to the dissemination of important international news. However, the ever-increasing volume of user-generated text, coupled with the dynamic nature of language, subtle or hidden nuances of expression, and contextual dependencies, makes timely and accurate sentiment analysis increasingly challenging. Sentiment analysis is an important task in its own right and also serves as the first step in many other classification tasks, such as hate speech and misinformation detection. Much of the research on sentiment analysis and opinion mining has concentrated on categorizing social media content into three classes: positive, negative, or neutral. The classification of extreme opinions, such as highly negative and highly positive sentiments, has only recently gained attention despite its importance across numerous practical domains. To address this gap, we propose GenSent, a novel genetic algorithm-based optimization framework for sentiment classification. Unlike traditional methods, which are often tailored to specific datasets, GenSent is applicable to diverse sentiment analysis tasks, ranging from binary and ternary classification to fine-grained 5-point scale classification that also represents extreme sentiments. Drawing on a diverse pool of classifiers, including Support Vector Machines, Naïve Bayes, Logistic Regression, Decision Trees, Random Forests, and Stochastic Gradient Descent algorithms, GenSent builds a robust ensemble without manual intervention. The framework is evaluated on binary, ternary, and fine-grained sentiment analysis datasets, namely the SemEval-2017 Sentiment Analysis in Twitter task (subtasks 4A, 4B, and 4C) and the Stanford Sentiment Treebank (SST-2 and SST-5).
Its performance is compared with that of well-known existing methods on the same datasets. The comparative results show that GenSent outperforms these methods, achieving significant improvements in sentiment classification across various metrics while reducing computational complexity.
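The abstract does not spell out how the genetic algorithm assembles the ensemble, so the following is only a minimal sketch of one common way such a search can be set up, not GenSent's actual procedure. It assumes each chromosome is a bit mask over the classifier pool, that fitness is the validation accuracy of a plurality vote over the selected classifiers, and that selection keeps the top half of the population. All names (`evolve`, `fitness`, `majority_vote`), the hyperparameters, and the synthetic validation predictions are hypothetical and chosen for illustration.

```python
import random
from collections import Counter


def majority_vote(preds_by_model, mask):
    """Combine the predictions of the selected models by plurality vote."""
    chosen = [p for p, bit in zip(preds_by_model, mask) if bit]
    n_samples = len(chosen[0])
    return [Counter(p[i] for p in chosen).most_common(1)[0][0]
            for i in range(n_samples)]


def fitness(mask, preds_by_model, y_true):
    """Validation accuracy of the voted ensemble; an empty ensemble scores 0."""
    if not any(mask):
        return 0.0
    voted = majority_vote(preds_by_model, mask)
    return sum(v == t for v, t in zip(voted, y_true)) / len(y_true)


def evolve(preds_by_model, y_true, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Search for a good subset of models (assumed GA design, see lead-in)."""
    rng = random.Random(seed)
    n = len(preds_by_model)

    def score(mask):
        return fitness(mask, preds_by_model, y_true)

    # Start from the full ensemble plus random masks, so the search can
    # never end up worse than simply using every model.
    pop = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]              # elitist selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randint(1, n - 1)           # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]            # bit-flip mutation
            children.append(child)
        pop = elite + children
    best = max(pop, key=score)
    return best, score(best)


# Synthetic stand-in for validation predictions of six classifiers on a
# three-class task; these numbers are NOT from the paper.
data_rng = random.Random(42)
y_val = [data_rng.choice([0, 1, 2]) for _ in range(200)]
reliability = [0.55, 0.60, 0.65, 0.70, 0.50, 0.45]  # hypothetical
val_preds = [
    [y if data_rng.random() < r else data_rng.choice([0, 1, 2]) for y in y_val]
    for r in reliability
]

best_mask, best_score = evolve(val_preds, y_val)
print("selected models:", best_mask, "validation accuracy:", round(best_score, 3))
```

In practice the rows of `val_preds` would come from classifiers such as those listed in the abstract, each fitted on training data and applied to a held-out validation split. Accuracy could also be swapped for whichever metric the target task reports.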
Details
Primary Language
English
Subjects
Machine Learning (Other), Computer System Software
Section
Research Article
Early Pub Date
October 26, 2025
Publication Date
March 29, 2026
Submission Date
May 25, 2025
Acceptance Date
September 28, 2025
Published in Issue
Year 2026 Volume: 29 Issue: 3