A RULE-BASED APPROACH USING THE ROUGH SET ON COVID-19 DATA

Rasim Çekik

doi:10.31796/ogummf.1420509

Research Article

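COVID-19 VERİLERİ ÜZERİNDE KABA KÜME KULLANARAK KURAL TABANLI BİR YAKLAŞIMLA DUYGU ANALİZİ (Turkish title; in English: SENTIMENT ANALYSIS WITH A RULE-BASED APPROACH USING ROUGH SETS ON COVID-19 DATA)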

Year 2024, Volume: 32 Issue: 2, 1363 - 1375, 12.08.2024

Abstract

The COVID-19 pandemic has not only caused loss of life but has also significantly affected people's emotional states. These emotional effects have had serious consequences for societies and economies around the world. To repair this damage, the emotional effects need to be analysed in depth. In this study, the effects of the pandemic on human emotions are analysed using soft computing techniques. A rule-based approach built on rough set theory is proposed for the analysis. The proposed method has two main components. The first selects an optimal feature subset (OFS) from the full feature set with the help of k of the best-known feature selection approaches. The second uses rough set methods to generate rules over the selected feature subset OFS. The study uses the "Real World Worry Dataset", the first available real-world dataset of emotional responses to COVID-19. The dataset consists of 5,000 texts (2,500 short + 2,500 long). In the experiments, the proposed approach was tested on both labelled and unlabelled data and achieved an accuracy of over 85%. The results also indicate that people were highly worried about the future because of the pandemic.
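
The two components named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the toy corpus, the function names, and the choice of information gain as a single ranking metric are all assumptions made for illustration (the abstract refers to several well-known feature selection approaches without naming them). The rule step keeps only "certain" rules, that is, rules from groups of texts that share the same feature values and the same label, which corresponds to the rough set lower approximation.

```python
from collections import defaultdict
from math import log2

# Hypothetical toy corpus; these texts are NOT taken from the paper's dataset.
DOCS = [
    ("i am worried for my family", "worry"),
    ("worried and scared about the future", "worry"),
    ("scared of losing my job", "worry"),
    ("enjoying calm time at home", "calm"),
    ("calm and relaxed with my family", "calm"),
    ("relaxed walk in the park", "calm"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    counts = defaultdict(int)
    for label in labels:
        counts[label] += 1
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in counts.values())


def information_gain(term, docs):
    """Information gain of the binary feature 'term occurs in the text'."""
    labels = [y for _, y in docs]
    present = [y for text, y in docs if term in text.split()]
    absent = [y for text, y in docs if term not in text.split()]
    conditional = sum(
        len(part) / len(docs) * entropy(part)
        for part in (present, absent)
        if part
    )
    return entropy(labels) - conditional


def select_top_k(docs, k):
    """Component 1 (assumed metric): keep the k highest-scoring terms."""
    vocab = sorted({t for text, _ in docs for t in text.split()})
    ranked = sorted(vocab, key=lambda t: (-information_gain(t, docs), t))
    return ranked[:k]


def induce_certain_rules(docs, features):
    """Component 2: one rule per consistent indiscernibility class.

    Texts with identical feature values form one class. A class whose
    members all share one label lies in the lower approximation of that
    label and yields a certain rule; mixed classes (the boundary region)
    yield no rule in this sketch.
    """
    blocks = defaultdict(list)
    for text, label in docs:
        tokens = set(text.split())
        blocks[tuple(t in tokens for t in features)].append(label)
    return {key: ys[0] for key, ys in blocks.items() if len(set(ys)) == 1}


def classify(text, features, rules, default=None):
    """Apply the rule matching the text's feature values, if one exists."""
    tokens = set(text.split())
    return rules.get(tuple(t in tokens for t in features), default)
```

For example, `select_top_k(DOCS, 4)` followed by `induce_certain_rules(DOCS, features)` yields rules that `classify` can apply to new text. Restricting the features to `["worried"]` leaves every text without that word in one mixed-label class, so only a single rule survives, which shows how the boundary region limits what can be stated with certainty.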

Keywords

COVID-19, Sentiment Analysis, Rough Set, Text Mining

There are 32 citations in total.

Details

Primary Language	English
Subjects	Computer Software
Journal Section	Research Articles
Authors	Rasim Çekik 0000-0002-7820-413X
Early Pub Date	August 6, 2024
Publication Date	August 12, 2024
Submission Date	January 15, 2024
Acceptance Date	May 16, 2024
Published in Issue	Year 2024 Volume: 32 Issue: 2

Cite

APA	Çekik, R. (2024). A RULE-BASED APPROACH USING THE ROUGH SET ON COVID-19 DATA. Eskişehir Osmangazi Üniversitesi Mühendislik Ve Mimarlık Fakültesi Dergisi, 32(2), 1363-1375. https://doi.org/10.31796/ogummf.1420509
AMA	Çekik R. A RULE-BASED APPROACH USING THE ROUGH SET ON COVID-19 DATA. ESOGÜ Müh Mim Fak Derg. August 2024;32(2):1363-1375. doi:10.31796/ogummf.1420509
Chicago	Çekik, Rasim. “A RULE-BASED APPROACH USING THE ROUGH SET ON COVID-19 DATA”. Eskişehir Osmangazi Üniversitesi Mühendislik Ve Mimarlık Fakültesi Dergisi 32, no. 2 (August 2024): 1363-75. https://doi.org/10.31796/ogummf.1420509.
EndNote	Çekik R (August 1, 2024) A RULE-BASED APPROACH USING THE ROUGH SET ON COVID-19 DATA. Eskişehir Osmangazi Üniversitesi Mühendislik ve Mimarlık Fakültesi Dergisi 32 2 1363–1375.
IEEE	R. Çekik, “A RULE-BASED APPROACH USING THE ROUGH SET ON COVID-19 DATA”, ESOGÜ Müh Mim Fak Derg, vol. 32, no. 2, pp. 1363–1375, 2024, doi: 10.31796/ogummf.1420509.
ISNAD	Çekik, Rasim. “A RULE-BASED APPROACH USING THE ROUGH SET ON COVID-19 DATA”. Eskişehir Osmangazi Üniversitesi Mühendislik ve Mimarlık Fakültesi Dergisi 32/2 (August 2024), 1363-1375. https://doi.org/10.31796/ogummf.1420509.
JAMA	Çekik R. A RULE-BASED APPROACH USING THE ROUGH SET ON COVID-19 DATA. ESOGÜ Müh Mim Fak Derg. 2024;32:1363–1375.
MLA	Çekik, Rasim. “A RULE-BASED APPROACH USING THE ROUGH SET ON COVID-19 DATA”. Eskişehir Osmangazi Üniversitesi Mühendislik Ve Mimarlık Fakültesi Dergisi, vol. 32, no. 2, 2024, pp. 1363-75, doi:10.31796/ogummf.1420509.
Vancouver	Çekik R. A RULE-BASED APPROACH USING THE ROUGH SET ON COVID-19 DATA. ESOGÜ Müh Mim Fak Derg. 2024;32(2):1363-75.
