Research Article

Apache Spark Tabanlı Duygu Analizi (Apache Spark Based Sentiment Analysis)

Year 2021, Volume: 4 Issue: 3, 233 - 241, 15.12.2021
https://doi.org/10.47495/okufbed.928826

Abstract

In this study, sentiment analysis was performed using Apache Spark, an open-source framework that can process big data quickly through in-memory computation. The MLlib machine learning library included in Spark was used for the sentiment analysis. Logistic regression (LR), support vector machine (SVM), and Naive Bayes classification algorithms were used, and the performance of these algorithms is evaluated according to different criteria.

Apache Spark Based Sentiment Analysis

Abstract

In this study, sentiment analysis is carried out using Apache Spark, an open-source framework capable of processing big data quickly through in-memory computation. The MLlib machine learning library included in Spark was used for the sentiment analysis. Logistic regression (LR), support vector machine (SVM), and Naive Bayes classification algorithms are used, and their performance is evaluated according to different criteria.
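The abstract does not name the criteria used to compare the three classifiers. As a minimal sketch, assuming binary sentiment labels and the commonly used measures of accuracy, precision, recall, and F1, the comparison could be computed as follows. The names `BinaryMetrics` and `evaluate` are hypothetical and are not taken from the paper or from MLlib.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    """Container for the assumed evaluation criteria (not named in the paper)."""
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(labels: Sequence[int], predictions: Sequence[int], positive: int = 1) -> BinaryMetrics:
    """Compute accuracy, precision, recall and F1 for binary sentiment labels."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")

    # Count the four cells of the confusion matrix.
    tp = fp = tn = fn = 0
    for y, p in zip(labels, predictions):
        if p == positive and y == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif y == positive:
            fn += 1
        else:
            tn += 1

    total = tp + fp + tn + fn
    # Guard every ratio against an empty denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1)


# Toy example: 1 = positive review, 0 = negative review (made-up values).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
predictions = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = evaluate(labels, predictions)
print(metrics)
```

Running the same function on the predictions of each classifier (LR, SVM, Naive Bayes) over a shared test set would give one row of scores per algorithm for a side-by-side comparison.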

References

  • Mohapatra, S., Ahmed, N. and Alencar, P., KryptoOracle: A Real-Time Cryptocurrency Price Prediction Platform Using Twitter Sentiments. 2019 IEEE International Conference on Big Data (Big Data), p. 5544-5551, 2019.
  • Kouloumpis, E., Wilson, T. and Moore, J., Twitter sentiment analysis: The good the bad and the omg! In Proceedings of the International AAAI Conference on Web and Social Media, 2011.
  • Pang, B., Lee, L. and Vaithyanathan, S., Thumbs up? Sentiment classification using machine learning techniques. arXiv preprint cs/0205070, 2002.
  • Wang, H., Can, D., Kazemzadeh, A., Bar, F. and Narayanan, S., A system for real-time Twitter sentiment analysis of 2012 US presidential election cycle. In Proceedings of the ACL 2012 System Demonstrations, 2012.
  • Neethu, M.S. and Rajasree, R., Sentiment Analysis in Twitter using Machine Learning Techniques. 2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT), 2013.
  • Zaharia, M., Chowdhury, M., Das, T., Dave, A., Ma, J., McCauley, M., Franklin, M.J., Shenker, S. and Stoica, I., Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing. In 9th USENIX Symposium on Networked Systems Design and Implementation, 12, 15-28, 2012.
  • Meng, X.R., et al., MLlib: Machine Learning in Apache Spark. Journal of Machine Learning Research, 17, 2016.
  • Yelp Review Sentiment Dataset, 2021. https://www.kaggle.com/ilhamfp31/yelp-review-dataset. (Accessed: 19.02.2021).
  • IMDB Dataset of 50K Movie Reviews, 2021. https://www.kaggle.com/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews. (Accessed: 19.02.2021).
  • Goel, A., Gautam, J. and Kumar, S., Real Time Sentiment Analysis of Tweets Using Naive Bayes. Proceedings on 2016 2nd International Conference on Next Generation Computing Technologies (NGCT), p. 257-261, 2016.
  • Jain, A.P. and Dandannavar, P., Application of Machine Learning Techniques to Sentiment Analysis. Proceedings of the 2016 2nd International Conference on Applied and Theoretical Computing and Communication Technology (ICATCCT), p. 628-632, 2016.
  • Hosmer, D.W., Lemeshow, S. and Sturdivant, R.X., Applied Logistic Regression Third Edition Preface. Applied Logistic Regression, 3rd Edition, p. xiii-+, 2013.
  • Zhu, G.B. and Blumberg, D.G., Classification using ASTER data and SVM algorithms; The case study of Beer Sheva, Israel. Remote Sensing of Environment, 80(2): p. 233-240, 2002.
There are 13 citations in total.

Details

Primary Language Turkish
Subjects Computer Software
Journal Section RESEARCH ARTICLES
Authors

Emre Yıldırım 0000-0002-9072-9780

Ali Çalhan

Publication Date December 15, 2021
Submission Date April 27, 2021
Acceptance Date August 2, 2021
Published in Issue Year 2021 Volume: 4 Issue: 3

Cite

APA Yıldırım, E., & Çalhan, A. (2021). Apache Spark Tabanlı Duygu Analizi. Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 4(3), 233-241. https://doi.org/10.47495/okufbed.928826
AMA Yıldırım E, Çalhan A. Apache Spark Tabanlı Duygu Analizi. Osmaniye Korkut Ata University Journal of The Institute of Science and Techno. December 2021;4(3):233-241. doi:10.47495/okufbed.928826
Chicago Yıldırım, Emre, and Ali Çalhan. “Apache Spark Tabanlı Duygu Analizi”. Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi 4, no. 3 (December 2021): 233-41. https://doi.org/10.47495/okufbed.928826.
EndNote Yıldırım E, Çalhan A (December 1, 2021) Apache Spark Tabanlı Duygu Analizi. Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi 4 3 233–241.
IEEE E. Yıldırım and A. Çalhan, “Apache Spark Tabanlı Duygu Analizi”, Osmaniye Korkut Ata University Journal of The Institute of Science and Techno, vol. 4, no. 3, pp. 233–241, 2021, doi: 10.47495/okufbed.928826.
ISNAD Yıldırım, Emre - Çalhan, Ali. “Apache Spark Tabanlı Duygu Analizi”. Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi 4/3 (December 2021), 233-241. https://doi.org/10.47495/okufbed.928826.
JAMA Yıldırım E, Çalhan A. Apache Spark Tabanlı Duygu Analizi. Osmaniye Korkut Ata University Journal of The Institute of Science and Techno. 2021;4:233–241.
MLA Yıldırım, Emre and Ali Çalhan. “Apache Spark Tabanlı Duygu Analizi”. Osmaniye Korkut Ata Üniversitesi Fen Bilimleri Enstitüsü Dergisi, vol. 4, no. 3, 2021, pp. 233-41, doi:10.47495/okufbed.928826.
Vancouver Yıldırım E, Çalhan A. Apache Spark Tabanlı Duygu Analizi. Osmaniye Korkut Ata University Journal of The Institute of Science and Techno. 2021;4(3):233-41.

*This journal is an international refereed journal.

*Our journal does not charge any article processing fees during the publication process.

*This journal is published online in 5 issues per year (January, March, June, September, December).

*This journal is published in Turkish and English as open access.

This work is licensed under a Creative Commons Attribution 4.0 International License.