Research Article

Examining the Effect of Text Representation Methods on Classification Performance in Sentiment Analysis Based on Machine Learning and Deep Learning

Year 2025, Volume: 15 Issue: 2, 923 - 959, 15.06.2025
https://doi.org/10.31466/kfbd.1536270

Abstract

In this study, binary and multi-class classification were performed on two different datasets using various text representation methods in conjunction with multiple machine learning and deep learning algorithms. The text representation methods employed were TF-IDF, GloVe, Word2Vec, FastText, and Bag of Words. The machine learning algorithms applied were Naive Bayes, Logistic Regression, Support Vector Machines, Random Forest, Artificial Neural Networks, K-Nearest Neighbors, Decision Trees, XGBoost, and LightGBM. The deep learning algorithms used were Convolutional Neural Networks, Recurrent Neural Networks, and Long Short-Term Memory (LSTM). The text representation methods and algorithms were compared on the basis of the results obtained. On the Amazon dataset, the highest accuracy among the machine learning methods was achieved by the LightGBM algorithm, and among the deep learning methods by the LSTM algorithm using TF-IDF and FastText. On the IMDb dataset, the Logistic Regression algorithm provided the highest accuracy among the machine learning methods, while the LSTM algorithm using FastText achieved the highest accuracy among the deep learning methods.
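As a rough illustration of two of the representation methods named in the abstract, the sketch below builds Bag of Words and TF-IDF vectors using only the Python standard library. It assumes naive whitespace tokenization and the unsmoothed weighting tf × log(N / df); the abstract does not state which preprocessing or TF-IDF variant the study used, and the three toy reviews are hypothetical rather than taken from the Amazon or IMDb datasets.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, raw count vectors) for a list of documents."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, TF-IDF vectors) using tf * log(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    vectors = []
    for vec in counts:
        total = sum(vec) or 1  # guard against empty documents
        vectors.append([(c / total) * w for c, w in zip(vec, idf)])
    return vocab, vectors


# Hypothetical toy reviews (assumption: not from the study's datasets).
reviews = [
    "great album great sound",
    "terrible movie",
    "great movie",
]
vocab, bow = bag_of_words(reviews)
_, weighted = tf_idf(reviews)
```

In this sketch a term that occurs in fewer documents (such as "terrible") receives a larger weight than an equally frequent term that occurs in more documents (such as "movie"), which is the property that distinguishes TF-IDF from raw Bag of Words counts. Either set of vectors could then be passed to any of the classifiers listed in the abstract.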

Makine Öğrenmesi ve Derin Öğrenmeye Dayalı Duygu Analizinde Metin Temsil Yöntemlerinin Sınıflandırma Başarımına Etkisinin İncelenmesi

Abstract

In this study, binary and three-class classification was performed on two different datasets using various text representation methods. TF-IDF, GloVe, Word2Vec, FastText, and Bag of Words were used as the text representation methods. Among machine learning algorithms, Naive Bayes, Logistic Regression, Support Vector Machines, Random Forest, Artificial Neural Network, K-Nearest Neighbors, Decision Tree, XGBoost, and LightGBM were applied. Convolutional Neural Network, Recurrent Neural Network, and Long Short-Term Memory (LSTM) were used as the deep learning algorithms. The results were used to compare the performance of the text representation methods and algorithms. On the Amazon dataset, the highest accuracy among the machine learning methods was obtained by the LightGBM algorithm, and among the deep learning methods by the LSTM algorithm using TF-IDF and FastText. On the IMDb dataset, the highest accuracy among the machine learning methods was obtained by the Logistic Regression algorithm, and among the deep learning methods by the LSTM algorithm using FastText.

Ethics Statement

This study complied with research and publication ethics.

Acknowledgments

Two open-source datasets were used in this study. The first dataset is available at https://www.kaggle.com/datasets/eswarchandt/amazon-music-reviews. The second dataset is available at https://www.kaggle.com/datasets/lakshmi25npathi/imdb-dataset-of-50k-movie-reviews. We thank the organizations that prepared these datasets and made them available.


Details

Primary Language Turkish
Subjects Computer Software, Software Engineering (Other)
Section Articles
Authors

Aleyna Kahraman 0009-0000-8418-6778

Durmuş Özkan Şahin 0000-0002-0831-7825

Dilara Bıyıklı 0009-0001-4474-9588

Feyzanur Aytekin 0009-0003-2344-7657

Hasan Basri Darga 0009-0003-9715-9922

Publication Date 15 June 2025
Submission Date 20 August 2024
Acceptance Date 13 March 2025
Published in Issue Year 2025 Volume: 15 Issue: 2

Cite

APA Kahraman, A., Şahin, D. Ö., Bıyıklı, D., … Aytekin, F. (2025). Makine Öğrenmesi ve Derin Öğrenmeye Dayalı Duygu Analizinde Metin Temsil Yöntemlerinin Sınıflandırma Başarımına Etkisinin İncelenmesi. Karadeniz Fen Bilimleri Dergisi, 15(2), 923-959. https://doi.org/10.31466/kfbd.1536270