Research Article

ACADEMIC TEXT CLUSTERING USING NATURAL LANGUAGE PROCESSING

Volume: 10 December 16, 2022

Abstract

Data are easy to access today, but using them efficiently requires extracting the right information from them. Categorizing data makes it possible to reach the needed information quickly. In academic research in particular, the material is mostly text: articles, conference papers, and theses. Natural language processing and machine learning methods are used to extract the needed information from such text-based data. In this study, abstracts of academic papers are clustered. The abstract texts are first preprocessed using natural language processing techniques. Vectorized word representations are then extracted from the preprocessed data with Word2Vec and BERT word embeddings, and these representations are clustered with four clustering algorithms.
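
The pipeline described in the abstract (preprocess, vectorize, cluster) can be illustrated with a minimal Python sketch. It is not the authors' implementation. To stay dependency-free, it substitutes normalized term-frequency vectors for the Word2Vec and BERT embeddings and uses plain k-means as a stand-in clustering algorithm, since the abstract does not name the four algorithms. The stop-word list, the sample abstracts, and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the paper's list).
STOP_WORDS = {"a", "an", "the", "of", "and", "in", "for", "with", "to", "is", "are", "on"}


def preprocess(text):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def vectorize(docs):
    """Map each token list to an L2-normalized term-frequency vector.

    Stand-in for the Word2Vec / BERT embeddings named in the abstract.
    """
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain k-means with deterministic farthest-first initialization."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Pick the point farthest from its nearest existing centroid.
        centroids.append(max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical sample abstracts: two about vision, two about agriculture.
abstracts = [
    "Deep neural networks for image classification",
    "Convolutional neural networks improve image recognition",
    "Soil moisture and crop yield in wheat farming",
    "Irrigation effects on wheat crop yield",
]
labels = kmeans(vectorize([preprocess(a) for a in abstracts]), k=2)
print(labels)
```

With real embeddings, only `vectorize` would change: each abstract would be mapped to a dense vector (for example, the mean of its word embeddings) and passed to the same clustering step.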

Details

Primary Language

English

Subjects

Engineering

Journal Section

Research Article

Publication Date

December 16, 2022

Submission Date

March 1, 2022

Acceptance Date

November 16, 2022

Published in Issue

Year 2022 Volume: 10

APA
Taşkıran, S. F., & Kaya, E. (2022). ACADEMIC TEXT CLUSTERING USING NATURAL LANGUAGE PROCESSING. Konya Journal of Engineering Sciences, 10, 41-51. https://doi.org/10.36306/konjes.1081213
AMA
1.Taşkıran SF, Kaya E. ACADEMIC TEXT CLUSTERING USING NATURAL LANGUAGE PROCESSING. KONJES. 2022;10:41-51. doi:10.36306/konjes.1081213
Chicago
Taşkıran, Salimkan Fatma, and Ersin Kaya. 2022. “ACADEMIC TEXT CLUSTERING USING NATURAL LANGUAGE PROCESSING”. Konya Journal of Engineering Sciences 10 (December): 41-51. https://doi.org/10.36306/konjes.1081213.
EndNote
Taşkıran SF, Kaya E (December 1, 2022) ACADEMIC TEXT CLUSTERING USING NATURAL LANGUAGE PROCESSING. Konya Journal of Engineering Sciences 10 41–51.
IEEE
[1]S. F. Taşkıran and E. Kaya, “ACADEMIC TEXT CLUSTERING USING NATURAL LANGUAGE PROCESSING”, KONJES, vol. 10, pp. 41–51, Dec. 2022, doi: 10.36306/konjes.1081213.
ISNAD
Taşkıran, Salimkan Fatma - Kaya, Ersin. “ACADEMIC TEXT CLUSTERING USING NATURAL LANGUAGE PROCESSING”. Konya Journal of Engineering Sciences 10 (December 1, 2022): 41-51. https://doi.org/10.36306/konjes.1081213.
JAMA
1.Taşkıran SF, Kaya E. ACADEMIC TEXT CLUSTERING USING NATURAL LANGUAGE PROCESSING. KONJES. 2022;10:41–51.
MLA
Taşkıran, Salimkan Fatma, and Ersin Kaya. “ACADEMIC TEXT CLUSTERING USING NATURAL LANGUAGE PROCESSING”. Konya Journal of Engineering Sciences, vol. 10, Dec. 2022, pp. 41-51, doi:10.36306/konjes.1081213.
Vancouver
1.Salimkan Fatma Taşkıran, Ersin Kaya. ACADEMIC TEXT CLUSTERING USING NATURAL LANGUAGE PROCESSING. KONJES. 2022 Dec. 1;10:41-51. doi:10.36306/konjes.1081213
