Natural language processing is the name given to methods in which the components of a language are analyzed in terms of both form and meaning. Natural language processing methods are continually updated, and new methods continue to be developed. This study examines current and popular libraries used in natural language processing, together with the methods those libraries employ. The different methods and libraries are explained comparatively.
Natural language processing is a collection of methods in which language components are analyzed both syntactically and semantically. This study surveys the tools and libraries used for natural language processing. Current and popular libraries, and the methods those libraries use, are explained comparatively.
Yılmaz, H., & Yumuşak, S. (2021). Açık Kaynak Doğal Dil İşleme Kütüphaneleri. İstanbul Sabahattin Zaim Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 3(1), 81-85. https://doi.org/10.47769/izufbed.879217