Over the past few years, sentiment analysis has been applied to social networking services such as LinkedIn, Facebook, YouTube, and Twitter, as well as to online product reviews, to determine public opinion or emotion from social media text. The methodology includes data selection, text pre-processing, feature extraction, classification, and result analysis. Text pre-processing is an important stage in structuring the data and improving the performance of the methodology. Feature extraction is a crucial step in sentiment analysis because it is difficult to obtain effective and useful information from highly unstructured social media data, and a number of feature extraction techniques (FETs) are available for this purpose. In this work, popular feature extraction techniques, including bag of words (BOW), term frequency-inverse document frequency (TF-IDF), and Word2vec, are compared and analyzed for the sentiment analysis of social media content. A method is proposed for processing text data from social media networks for sentiment analysis that uses a support vector machine as the classifier. The experiments are carried out on three datasets from different contexts, namely US Airline, Movie Review, and News from Twitter. The results show that TF-IDF consistently outperformed the other techniques, with best accuracies of 82.33%, 92.31%, and 99.10% for the Airline, Movie Review, and News datasets, respectively. It is also found that the proposed method performed better than some existing methods.
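To make the difference between two of the compared feature extraction techniques concrete, the following is a minimal, dependency-free sketch of BOW and TF-IDF vectorization. It is not the authors' implementation. The sample sentences, the function names, and the particular TF-IDF variant (length-normalized term frequency with an unsmoothed natural-log IDF) are assumptions chosen for illustration; library implementations typically apply smoothing and normalization that differ from this.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (tok.strip(".,!?;:\"'()") for tok in text.lower().split())
    return [tok for tok in tokens if tok]


def build_vocabulary(docs):
    """Return the sorted list of unique tokens across all documents."""
    return sorted({tok for doc in docs for tok in tokenize(doc)})


def bow_vector(doc, vocab):
    """Bag of words: raw count of each vocabulary term in the document."""
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocab]


def tfidf_vectors(docs, vocab):
    """TF-IDF with tf = count / doc length and idf = ln(N / df).

    This unsmoothed variant is an assumption made for illustration.
    """
    n_docs = len(docs)
    tokenized = [tokenize(doc) for doc in docs]
    # Document frequency: number of documents that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {term: math.log(n_docs / df[term]) for term in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # guard against empty documents
        vectors.append([(counts[t] / total) * idf[t] for t in vocab])
    return vectors


# Hypothetical sample texts, not taken from the paper's datasets.
docs = [
    "The flight was late and the crew was rude",
    "The movie was great",
    "The crew was great and the flight was great",
]

vocab = build_vocabulary(docs)
bow = [bow_vector(doc, vocab) for doc in docs]
tfidf = tfidf_vectors(docs, vocab)
```

In this sketch, a word such as "the" that occurs in every document keeps a non-zero BOW count but receives a TF-IDF weight of zero, while a word such as "movie" that occurs in only one document is weighted up. Vectors of either kind could then be passed to a classifier such as the support vector machine mentioned above.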
I confirm that this work is original and has not been published elsewhere, nor is it currently under consideration for publication elsewhere.
Primary Language | English |
---|---|
Subjects | Automated Software Engineering, Reinforcement Learning |
Section | Articles |
Authors | |
Early Pub Date | October 28, 2024 |
Publication Date | October 31, 2024 |
Submission Date | May 2, 2024 |
Acceptance Date | June 6, 2024 |
Published in Issue | Year 2024 Volume: 8 Issue: 4 |