Conference Paper

Açık Kaynak Doğal Dil İşleme Kütüphaneleri

Year 2021, Volume 3, Issue 1, 81-85, 30.04.2021
https://doi.org/10.47769/izufbed.879217

Abstract

Natural language processing is the name given to methods that analyze the components of language in terms of both form and meaning. Natural language processing methods are continually updated, and new methods are being developed. This study examines current and popular libraries used in natural language processing and the methods used in these libraries. Different methods and libraries are explained comparatively.


Open Source Natural Language Processing Libraries

Abstract

Natural language processing is a collection of methods in which language components are analyzed both syntactically and semantically. This study presents the set of tools and libraries classified as natural language processing methods. Current and popular libraries used in natural language processing, and the methods used in these libraries, are explained comparatively.


Details

Primary Language: Turkish
Subjects: Engineering
Section: Articles
Authors

Havva Yılmaz 0000-0002-0217-099X

Semih Yumuşak 0000-0002-8878-4991

Publication Date: April 30, 2021
Submission Date: February 14, 2021
Acceptance Date: March 23, 2021
Published in Issue: Year 2021

Cite

APA Yılmaz, H., & Yumuşak, S. (2021). Açık Kaynak Doğal Dil İşleme Kütüphaneleri. İstanbul Sabahattin Zaim Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 3(1), 81-85. https://doi.org/10.47769/izufbed.879217


This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.