Research Article

SENTIMENT ANALYSIS USING A RANDOM FOREST CLASSIFIER ON TURKISH WEB COMMENTS

Volume: 59 Number: 2 December 21, 2017

Abstract

Sentiment analysis has been an active research area since the early 2000s as a subfield of text classification. Most studies in this field analyze English text, while Turkish and other languages have received far less attention. The purpose of this research is to contribute to text analysis in Turkish using content collected from websites. In particular, we infer the sentiment behind noisy product reviews and comments on a highly popular commercial website. To this end, we build a unique dataset of 9100 product review samples for training our classification model. Several word representation methods are used in sentiment analysis, such as bag-of-words and n-gram models. In this work, we generate our word representations with the word2vec algorithm, so that each word in the vocabulary is represented as a 300-dimensional vector. We use 70% of our dataset to train a Random Forest model and classify sentiments as positive or negative, using the rating each user gave the product as the classification label. On these highly noisy and unfiltered comments, we achieve an accuracy of 84.23%.
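The data preparation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how ratings are turned into labels or how word vectors are combined into one feature vector per comment, so the rating threshold, the averaging step, the toy 3-dimensional vectors (standing in for the 300-dimensional word2vec vectors) and all function names below are assumptions made for illustration only.

```python
import random


def rating_to_label(rating, threshold=4):
    """Map a user rating to a binary label (1 = positive, 0 = negative).

    The threshold is an assumption; the abstract only says that ratings
    serve as classification labels.
    """
    return 1 if rating >= threshold else 0


def comment_vector(tokens, word_vectors, dim):
    """Average the vectors of the tokens found in the vocabulary.

    Averaging is an assumed way of turning per-word vectors into a single
    feature vector. Unknown tokens are skipped; a comment with no known
    token maps to the zero vector.
    """
    known = [word_vectors[t] for t in tokens if t in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def train_test_split(samples, train_percent=70, seed=0):
    """Shuffle a copy of the samples and split it by an integer percentage."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_percent // 100
    return shuffled[:cut], shuffled[cut:]


# Toy vocabulary with 3-dimensional vectors (hypothetical values).
toy_vectors = {
    "güzel": [1.0, 0.0, 0.5],
    "kötü": [-1.0, 0.0, 0.5],
    "ürün": [0.0, 1.0, 0.0],
}

# "bilinmeyen" is not in the toy vocabulary and is ignored.
features = comment_vector(["güzel", "ürün", "bilinmeyen"], toy_vectors, dim=3)

# A 70/30 split of 9100 sample indices gives 6370 training and 2730 held-out samples.
train_ids, test_ids = train_test_split(range(9100))
```

Feature vectors and labels built this way could then be passed to any Random Forest implementation, for example `RandomForestClassifier` in scikit-learn, with accuracy measured on the held-out 30%.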

Details

Primary Language

English

Journal Section

Research Article

Publication Date

December 21, 2017

Submission Date

November 11, 2017

Acceptance Date

December 21, 2017

Published in Issue

Year 2017 Volume: 59 Number: 2

APA
Pervan, N., & Yalım Keleş, H. (2017). SENTIMENT ANALYSIS USING A RANDOM FOREST CLASSIFIER ON TURKISH WEB COMMENTS. Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering, 59(2), 69-79. https://izlik.org/JA32EK62RG

Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering is licensed under a Creative Commons Attribution 4.0 International License.
