Investigating Word Association Mining Techniques
Abstract
This study investigates the effect of conditional entropy, mutual information (MI), log-likelihood ratio (LLR), and simple co-occurrence counts on extracting strong syntagmatic relationships. Experiments are conducted on the Yelp Academic Dataset, from which 10,000 restaurant reviews were extracted. The MI values of word pairs are used to extract the top syntagmatically related words from the corpus. For this purpose, Spyder 3.3.6 and the Python Natural Language Toolkit (NLTK) library are used. The MI values are then compared with simple co-occurrence counts. The results indicate that the three word collocation techniques give similar results and, therefore, that all of them can be employed effectively to find word collocations.
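The comparison described above, ranking word pairs by mutual information and by raw co-occurrence count, can be sketched in plain Python. This is a minimal illustration and not the authors' NLTK pipeline: it assumes each review is one segment, models each word as a binary "present in the segment" variable, applies no smoothing, and uses made-up toy reviews in place of the Yelp data. The function names `presence_counts`, `mutual_information`, and `rank_pairs` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def presence_counts(segments):
    """Count the segments containing each word and each unordered word pair."""
    word_df = Counter()
    pair_df = Counter()
    for seg in segments:
        vocab = sorted(set(seg.lower().split()))  # naive whitespace tokenizer (assumption)
        word_df.update(vocab)
        pair_df.update(combinations(vocab, 2))    # pairs are alphabetically ordered tuples
    return word_df, pair_df


def mutual_information(n_a, n_b, n_ab, n):
    """MI in bits between the binary presence variables of two words.

    n_a, n_b: segments containing each word; n_ab: segments containing both;
    n: total segments. Cells with zero probability contribute nothing.
    """
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            if a and b:
                joint = n_ab
            elif a:
                joint = n_a - n_ab
            elif b:
                joint = n_b - n_ab
            else:
                joint = n - n_a - n_b + n_ab
            if joint == 0:
                continue
            p_ab = joint / n
            p_a = (n_a if a else n - n_a) / n
            p_b = (n_b if b else n - n_b) / n
            mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def rank_pairs(segments, top_k=5):
    """Return the top_k co-occurring pairs ranked by MI and by raw count."""
    word_df, pair_df = presence_counts(segments)
    n = len(segments)
    scored = [
        (pair, count, mutual_information(word_df[pair[0]], word_df[pair[1]], count, n))
        for pair, count in pair_df.items()  # only pairs that co-occur at least once
    ]
    by_mi = sorted(scored, key=lambda t: t[2], reverse=True)[:top_k]
    by_count = sorted(scored, key=lambda t: t[1], reverse=True)[:top_k]
    return by_mi, by_count


if __name__ == "__main__":
    # Toy reviews invented for illustration; not taken from the Yelp dataset.
    reviews = [
        "great pizza and friendly service",
        "the pizza was great",
        "slow service and cold food",
        "friendly staff and great food",
    ]
    by_mi, by_count = rank_pairs(reviews, top_k=3)
    print("Top pairs by MI:", by_mi)
    print("Top pairs by co-occurrence count:", by_count)
```

Raw counts tend to favor pairs that involve frequent words, whereas MI rewards pairs whose presence patterns are informative about each other, so the two rankings may differ on small samples even if they broadly agree on a larger corpus.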
Details
Primary Language
English
Subjects
Engineering
Section
Research Article
Publication Date
December 25, 2022
Submission Date
October 7, 2022
Acceptance Date
November 21, 2022
Published in Issue
Year 2022 Volume: 5 Issue: 2
APA
Bağcı Daş, D., & Genç, S. (2022). Investigating Word Association Mining Techniques. Veri Bilimi, 5(2), 106-114. https://izlik.org/JA48CY66AC
Chicago
Bağcı Daş, Duygu, and Sevdanur Genç. 2022. “Investigating Word Association Mining Techniques”. Veri Bilimi 5 (2): 106-14. https://izlik.org/JA48CY66AC.