Research Article

A Text Mining Application Using Weighted Majority Voting Ensemble Method

Year 2024, Volume: 26 Issue: 78, 440 - 448, 27.09.2024
https://doi.org/10.21205/deufmd.2024267810

Abstract

Although it was introduced relatively recently, sentiment analysis is steadily gaining popularity in text mining. An important source of feedback in this line of research is the opinion expressed about text-based content. The general goal is to analyze product and service reviews or comments so that they can be compared and contrasted with each other via the ratings they receive. To boost the classification accuracy of conventional single machine learning models, this study uses an ensemble method that we proposed earlier. Five related but non-identical models, Naïve Bayes, OneR, Hoeffding Tree, REPTree, and KNN, serve as base classifiers, and their class decisions are combined by a weighted majority voting ensemble mechanism called WMVE. The outputs were also compared with those of a standard Majority Voting ensemble (MV) built from the same base classifiers. According to the findings, the WMVE model outperformed the other classifiers, achieving an average accuracy of 77.35 and an F-score of 77.19. Consequently, an ensemble model based on WMVE can be used to enhance sentiment analysis classification performance.
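
The difference between standard majority voting and weighted majority voting can be illustrated with a minimal Python sketch. It is not the authors' WMVE implementation: the abstract does not state how WMVE derives its weights, so the weights below are hypothetical placeholders, as are the votes, the class labels, and the function names `majority_vote` and `weighted_majority_vote`.

```python
from collections import Counter, defaultdict


def majority_vote(predictions):
    """Plain majority vote: every classifier counts equally."""
    counts = Counter(predictions)
    # Highest count wins; ties are broken alphabetically by label.
    return min(counts, key=lambda label: (-counts[label], label))


def weighted_majority_vote(predictions, weights):
    """Weighted vote: each classifier adds its weight to the class it predicts."""
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    # Highest total weight wins; ties are broken alphabetically by label.
    return min(scores, key=lambda label: (-scores[label], label))


# Hypothetical votes of five base classifiers for a single review.
votes = ["neg", "neg", "neg", "pos", "pos"]
# Hypothetical per-classifier weights (assumption, not taken from the paper).
weights = [0.5, 0.5, 0.5, 0.9, 0.9]

mv_label = majority_vote(votes)                      # three of five votes
wmv_label = weighted_majority_vote(votes, weights)   # 1.5 vs 1.8 total weight
print(mv_label, wmv_label)
```

With these illustrative numbers the two schemes disagree: the unweighted vote follows the three classifiers that predict `neg`, while the weighted vote follows the two more heavily weighted classifiers that predict `pos`. How the weights are chosen is the part that distinguishes one weighted scheme from another, and it is left open here.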


Details

Primary Language: English
Subjects: Computer Vision and Multimedia Computation (Other)
Journal Section: Research Article
Authors:
  Alican Doğan (ORCID: 0000-0002-0553-2888)
  Mansur Alp Toçoğlu (ORCID: 0000-0003-1784-9003)
Early Pub Date: September 17, 2024
Publication Date: September 27, 2024
Submission Date: October 20, 2023
Acceptance Date: January 6, 2024
Published in Issue: Year 2024, Volume 26, Issue 78

Cite

APA Doğan, A., & Toçoğlu, M. A. (2024). A Text Mining Application Using Weighted Majority Voting Ensemble Method. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen Ve Mühendislik Dergisi, 26(78), 440-448. https://doi.org/10.21205/deufmd.2024267810

Dokuz Eylül Üniversitesi, Mühendislik Fakültesi Dekanlığı Tınaztepe Yerleşkesi, Adatepe Mah. Doğuş Cad. No: 207-I / 35390 Buca-İZMİR.