Over the past few years, sentiment analysis has increasingly been applied to social networking services such as LinkedIn, Facebook, YouTube, and Twitter, as well as to online product reviews, to determine public opinion or emotion from social media text. The methodology comprises data selection, text pre-processing, feature extraction, classification, and result analysis. Text pre-processing is an important stage because it structures the data and thereby improves the performance of the methodology. Feature extraction is a crucial step in sentiment analysis because it is difficult to obtain effective and useful information from highly unstructured social media data, and a number of feature extraction techniques (FETs) are available for this purpose. In this work, three popular feature extraction techniques, bag of words (BOW), term frequency-inverse document frequency (TF-IDF), and Word2vec, are compared and analyzed for the sentiment analysis of social media content. A method is proposed for processing text data from social media networks for sentiment analysis that uses a support vector machine as the classifier. The experiments are carried out on three datasets from different contexts, namely US Airline, Movie Review, and News from Twitter. The results show that TF-IDF consistently outperformed the other techniques, with best accuracies of 82.33%, 92.31%, and 99.10% for the Airline, Movie Review, and News datasets, respectively. The proposed method was also found to perform better than some existing methods.
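To make the difference between the BOW and TF-IDF representations concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the whitespace tokenizer, the unsmoothed weighting tf × log(N/df), the function names, and the toy corpus are all assumptions chosen for illustration, and the abstract does not state which TF-IDF variant was used. In a full pipeline, vectors like these would be passed to a classifier such as a support vector machine.

```python
import math
from collections import Counter


def tokenize(text):
    # Very rough pre-processing (assumption): lowercase and split on whitespace.
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, raw count vectors) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * log(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    vectors = []
    for row in counts:
        total = sum(row) or 1  # guard against empty documents
        vectors.append([(c / total) * w for c, w in zip(row, idf)])
    return vocab, vectors


# Hypothetical toy corpus, not taken from the paper's datasets.
docs = [
    "the flight was late",
    "the flight was great",
    "great movie",
]
vocab, bow = bag_of_words(docs)
_, weighted = tf_idf(docs)
```

In the BOW vectors every word in the first document has the same count, whereas TF-IDF gives "late" (which appears in only one document) a higher weight than "flight" (which appears in two). This down-weighting of common terms is the usual motivation for preferring TF-IDF over raw counts.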
I confirm that this work is original and has not been published elsewhere, nor is it currently under consideration for publication elsewhere.
Primary Language | English |
---|---|
Subjects | Automated Software Engineering, Reinforcement Learning |
Journal Section | Articles |
Authors | |
Early Pub Date | October 28, 2024 |
Publication Date | October 31, 2024 |
Submission Date | May 2, 2024 |
Acceptance Date | June 6, 2024 |
Published in Issue | Year 2024 |