CLUDS: COMBINING LABELED AND UNLABELED DATA WITH LOGISTIC REGRESSION FOR SOCIAL MEDIA ANALYSIS
Abstract
Automatic text classification and sentiment polarity detection are two important research problems in social media analysis. Word meanings matter enough that a document classification algorithm needs to capture them to classify accurately. Another important issue in text classification is the scarcity of labeled data. This study presents Combining Labeled and Unlabeled Data with Semantic Values of Terms (CLUDS). CLUDS has four steps: preprocessing, instance labeling, combining labeled and unlabeled data, and prediction. The preprocessing step uses the Latent Dirichlet Allocation (LDA) algorithm, and the instance labeling step applies Logistic Regression. CLUDS applies relevance value computation, a supervised term weighting methodology from the text classification field. According to the literature, CLUDS is the first attempt to use both relevance and weighting calculation in a semi-supervised semantic kernel for Support Vector Machines (SVM). Two variants, Sprinkled-CLUDS and Adaptive-Sprinkled-CLUDS, have also been implemented. Experimental results show that CLUDS, Sprinkled-CLUDS and Adaptive-Sprinkled-CLUDS yield a performance gain over the baseline algorithms on the test sets.
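The instance labeling and combining steps named in the abstract can be illustrated with a generic self-training sketch. This is not the CLUDS implementation: the LDA preprocessing, the relevance-based term weighting, and the SVM semantic kernel are all omitted, and the toy tweets, the 0.7 confidence threshold, and every function name below are assumptions made purely for illustration. The sketch trains a small logistic regression on labeled texts, assigns labels to unlabeled texts only when the predicted probability is confident, and returns the combined set.

```python
import math

def tokenize(text):
    return text.lower().split()

def featurize(text, vocab):
    # Binary bag-of-words vector with a bias term appended at the end.
    tokens = set(tokenize(text))
    return [1.0 if term in tokens else 0.0 for term in vocab] + [1.0]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, epochs=500, lr=0.5):
    # Plain batch gradient descent on the logistic loss (no regularization).
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)))

def pseudo_label(labeled, unlabeled, threshold=0.7):
    # Hypothetical helper: label confident unlabeled texts, then combine.
    vocab = sorted({t for text, _ in labeled for t in tokenize(text)})
    X = [featurize(text, vocab) for text, _ in labeled]
    y = [label for _, label in labeled]
    w = train_logreg(X, y)
    combined = list(labeled)
    for text in unlabeled:
        p = predict_proba(w, featurize(text, vocab))
        if p >= threshold:
            combined.append((text, 1))   # confident positive
        elif p <= 1 - threshold:
            combined.append((text, 0))   # confident negative
        # otherwise the text stays unlabeled and is left out
    return combined

# Toy data (invented for this sketch): 1 = positive, 0 = negative polarity.
labeled = [
    ("great phone love it", 1),
    ("love this great service", 1),
    ("terrible battery hate it", 0),
    ("hate this terrible service", 0),
]
unlabeled = ["great love", "terrible hate", "the weather today"]
combined = pseudo_label(labeled, unlabeled)
```

In this toy run the two clearly polar texts receive pseudo-labels, while the off-topic text shares no vocabulary with the labeled data, scores close to 0.5, and is left out. In a full pipeline, the combined set would then feed the final classifier, which the abstract states is an SVM with a semantic kernel.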
Keywords
- Tweet Classification
- Latent Dirichlet Allocation
- Logistic Regression
- Social Media Analysis
- Sentiment Polarity Detection
Project Number
118E315
Details
Primary Language
English
Subjects
Computer Software
Journal Section
Research Article
Authors
Ayşe Berna Altınel
Publication Date
December 20, 2021
Submission Date
August 13, 2020
Acceptance Date
September 6, 2021
Published in Issue
Year: 2021, Volume: 9, Number: 4
APA
Altınel, A. B. (2021). CLUDS: COMBINING LABELED AND UNLABELED DATA WITH LOGISTIC REGRESSION FOR SOCIAL MEDIA ANALYSIS. Mühendislik Bilimleri Ve Tasarım Dergisi, 9(4), 1048-1061. https://doi.org/10.21923/jesd.780002
Cited By
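TÜRKÇE KONUŞMADA DUYGU TANIMA İÇİN MAKİNE ÖĞRENME YÖNTEMLERİ VE DERİN ÖĞRENME TABANLI MODELLERİN KARŞILAŞTIRILMASI [Comparison of Machine Learning Methods and Deep Learning-Based Models for Emotion Recognition in Turkish Speech]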
Mühendislik Bilimleri ve Tasarım Dergisi
https://doi.org/10.21923/jesd.1350375