Research Article

Sentiment analysis with ensemble and machine learning methods in multi-domain datasets

Year 2023, Volume 7, Issue 2, 141 - 148, 15.04.2023
https://doi.org/10.31127/tuje.1079698

Abstract

Comments on websites have become the first place people look for opinions on almost every activity of everyday life. Natural language processing, a sub-branch of artificial intelligence, is the field that deals with interpreting such comments. Sentiment analysis, a natural language processing task, is carried out so that these comments can inform and even guide people. In this study, sentiment analysis was performed on public user feedback from websites in two different domains. The TripAdvisor dataset contains positive or negative user comments about hotels, and the Rotten Tomatoes dataset contains positive (fresh) or negative (rotten) user comments about films. Sentiment analysis on the datasets was carried out using two text representations, the Word2Vec word embedding model, which learns a vector representation of each word that captures the positive or negative meaning of the sentences, and the Term Frequency-Inverse Document Frequency (TF-IDF) text representation model, together with four machine learning methods (Naïve Bayes-NB, Support Vector Machines-SVM, Logistic Regression-LR, K-Nearest Neighbour-kNN) and two ensemble learning methods (Stacking, Majority Voting-MV). Accuracy and F-measure were used as the performance metrics in the experiments. According to the results, the ensemble learning methods performed better than the single machine learning algorithms. Among all approaches, MV outperformed Stacking.
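As a rough illustration of the majority voting and evaluation steps named in the abstract, the sketch below combines the label predictions of several classifiers by hard voting and scores the result with accuracy and F-measure (the harmonic mean of precision and recall). It is a minimal sketch, not the authors' implementation: the function names, the tie-breaking rule, and the toy labels are all assumptions made for illustration, and the numbers it produces say nothing about the TripAdvisor or Rotten Tomatoes results.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard majority voting over equal-length label lists, one per classifier.

    Assumption: ties go to the label that appears first among the votes,
    i.e. the label predicted by the earliest classifier in the list.
    """
    combined = []
    for votes in zip(*predictions_per_model):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred, positive="pos"):
    """F-measure (F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels, made up for this sketch (not data from the study).
y_true = ["pos", "neg", "pos", "pos", "neg", "neg"]
nb_pred = ["pos", "neg", "neg", "pos", "neg", "pos"]
svm_pred = ["pos", "pos", "pos", "pos", "neg", "neg"]
lr_pred = ["neg", "neg", "pos", "pos", "neg", "neg"]

mv_pred = majority_vote([nb_pred, svm_pred, lr_pred])

for name, pred in [("NB", nb_pred), ("SVM", svm_pred),
                   ("LR", lr_pred), ("MV", mv_pred)]:
    print(name, round(accuracy(y_true, pred), 3),
          round(f_measure(y_true, pred), 3))
```

With an even number of base classifiers, such as the four listed in the abstract, binary votes can tie, so a real setup would need an explicit tie-breaking rule or probability-based (soft) voting; the rule used here is only one possible choice.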

Project Number

None

Supporting Institution

None

There are 26 citations in total.

Details

Primary Language English
Subjects Engineering
Journal Section Articles
Authors

Muhammet Sinan Başarslan 0000-0002-7996-9169

Fatih Kayaalp 0000-0002-8752-3335

Project Number None
Publication Date April 15, 2023
Published in Issue Year 2023, Volume 7, Issue 2

Cite

APA Başarslan, M. S., & Kayaalp, F. (2023). Sentiment analysis with ensemble and machine learning methods in multi-domain datasets. Turkish Journal of Engineering, 7(2), 141-148. https://doi.org/10.31127/tuje.1079698
AMA Başarslan MS, Kayaalp F. Sentiment analysis with ensemble and machine learning methods in multi-domain datasets. TUJE. April 2023;7(2):141-148. doi:10.31127/tuje.1079698
Chicago Başarslan, Muhammet Sinan, and Fatih Kayaalp. “Sentiment Analysis With Ensemble and Machine Learning Methods in Multi-Domain Datasets”. Turkish Journal of Engineering 7, no. 2 (April 2023): 141-48. https://doi.org/10.31127/tuje.1079698.
EndNote Başarslan MS, Kayaalp F (April 1, 2023) Sentiment analysis with ensemble and machine learning methods in multi-domain datasets. Turkish Journal of Engineering 7 2 141–148.
IEEE M. S. Başarslan and F. Kayaalp, “Sentiment analysis with ensemble and machine learning methods in multi-domain datasets”, TUJE, vol. 7, no. 2, pp. 141–148, 2023, doi: 10.31127/tuje.1079698.
ISNAD Başarslan, Muhammet Sinan - Kayaalp, Fatih. “Sentiment Analysis With Ensemble and Machine Learning Methods in Multi-Domain Datasets”. Turkish Journal of Engineering 7/2 (April 2023), 141-148. https://doi.org/10.31127/tuje.1079698.
JAMA Başarslan MS, Kayaalp F. Sentiment analysis with ensemble and machine learning methods in multi-domain datasets. TUJE. 2023;7:141–148.
MLA Başarslan, Muhammet Sinan and Fatih Kayaalp. “Sentiment Analysis With Ensemble and Machine Learning Methods in Multi-Domain Datasets”. Turkish Journal of Engineering, vol. 7, no. 2, 2023, pp. 141-8, doi:10.31127/tuje.1079698.
Vancouver Başarslan MS, Kayaalp F. Sentiment analysis with ensemble and machine learning methods in multi-domain datasets. TUJE. 2023;7(2):141-8.