Conference Paper

Açık Kaynak Doğal Dil İşleme Kütüphaneleri

Year 2021, Volume: 3 Issue: 1, 81 - 85, 30.04.2021
https://doi.org/10.47769/izufbed.879217

Abstract

Natural language processing is the name given to methods that analyze language components in terms of both form and meaning. Natural language processing methods are continually updated, and new methods are being developed. This study examines current and popular libraries used in natural language processing and the methods used in these libraries. The different methods and libraries are explained comparatively.

References

  • Aggarwal, C. C., & Zhai, C. (2012). A survey of text classification algorithms. In Mining text data (pp. 163-222). Springer, Boston, MA.
  • Akbik, Roland, (2018). Flair A very simple framework for state-of-the-art NLP, 2020. [Online]: https://github.com/flairNLP/flair
  • Barrus, (2018). Pyspellchecker Pure python spell checker based on work by Peter Norvig, 2020. [Online]: https://pypi.org/project/pyspellchecker/
  • Bora, (2020). Zemberek Parser, 2019. [Online]: https://github.com/kemalcanbora/zemberek_parser
  • Buitinck, Louppe, (2013). Scikit-learn 0.23.2 Machine Learning in Python, 2020. [Online]: https://scikit-learn.org/stable/
  • Bird, (2019). Natural Language Toolkit, 2020. [Online]: http://www.nltk.org
  • Cambria, E., Poria, S., Gelbukh, A., & Thelwall, M. (2017). Sentiment analysis is a big suitcase. IEEE Intelligent Systems, 32(6), 74-80.
  • Chowdhary, (2020). Natural language processing. In Fundamentals of Artificial Intelligence (pp. 603-649). Springer, New Delhi.
  • Çetinkaya, (2018). Turkish NLP, 2020. [Online]: https://pypi.org/project/turkishnlp/
  • David, (2020). How many languages in the world. [Online]: https://www.ethnologue.com/guides/how-many-languages
  • Dehkharghani, R., Saygin, Y., Yanikoglu, B., & Oflazer, K. (2015). SentiTurkNet: a Turkish polarity lexicon for sentiment analysis. Language Resources and Evaluation, 50. https://doi.org/10.1007/s10579-015-9307-6
  • Eryiğit, G. (2014, April). ITU Turkish NLP web service. In Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics (pp. 1-4).
  • Gardner, (2017). A natural language processing platform for building state-of-the-art models, 2020. [Online]: https://allennlp.org/
  • Honnibal, Matthew, (2017). Spacy Industrial-Strength Natural Language Processing in Python, 2020. [Online]: https://spacy.io/
  • Howard, (2017). Fast.ai, 2020. [Online]: https://www.fast.ai/
  • Jivani, A. G. (2011). A comparative study of stemming algorithms. Int. J. Comp. Tech. Appl, 2(6), 1930-1938.
  • Koksal, (2018). GitHub - akoksal/Turkish-Lemmatizer: Lemmatization for Turkish Language, 2018. [Online]: https://github.com/akoksal/Turkish-Lemmatizer
  • Liang, J., Koperski, K., Dhillon, N. S., Tusk, C., & Bhatti, S. (2013). U.S. Patent No. 8,594,996. Washington, DC: U.S. Patent and Trademark Office.
  • Lorai, (2013). TextBlob: Simplified Text Processing, 2020. [Online]: https://textblob.readthedocs.io/en/dev/
  • Majumder, P., Mitra, M., & Chaudhuri, B. B. (2002). N-gram: a language independent approach to IR and NLP. In International conference on universal knowledge and language.
  • McClosky, Bauer, (2014). Stanford CoreNLP: A Java suite of core NLP tools, 2020. [Online]: https://github.com/stanfordnlp/CoreNLP
  • Onaldi, (2018). Turkish Stemmer, 2019. [Online]: https://github.com/otuncelli/turkish-stemmer-python/
  • Paszke, (2017). Pytorch/pytorch: Tensors and Dynamic neural networks in Python with strong GPU acceleration, 2020. [Online]: https://github.com/pytorch/pytorch
  • Radim, Sojka, (2008). Gensim 3.8.3 Python framework for fast Vector Space Modelling, 2020. [Online]: https://pypi.org/project/gensim/
  • Sciforce, (2020). Text Preprocessing for NLP and Machine Learning Tasks. [Online]: https://medium.com/sciforce/text-preprocessing-for-nlp-and-machine-learning-tasks-3e077aa4946e
  • Sun, Q., Wang, B., Gu, Z., & Fu, Y. (2018). VECTORIZATION METHODS IN RECOMMENDER SYSTEM.
  • Webster, (1992). Tokenization as the initial phase in NLP. In COLING 1992 Volume 4: The 15th International Conference on Computational Linguistics.
  • Zemberek, (2007). NLP tools for Turkish. [Online]: https://github.com/ahmetaa/zemberek-nlp

Open Source Natural Language Processing Libraries

Abstract

Natural language processing is a collection of methods in which language components are analyzed both syntactically and semantically. This study presents the set of tools and libraries classified as natural language processing methods. Current and popular libraries used in natural language processing, and the methods used in these libraries, are explained comparatively.

There are 28 citations in total.

Details

Primary Language Turkish
Subjects Engineering
Journal Section Articles
Authors

Havva Yılmaz 0000-0002-0217-099X

Semih Yumuşak 0000-0002-8878-4991

Publication Date April 30, 2021
Submission Date February 14, 2021
Acceptance Date March 23, 2021
Published in Issue Year 2021 Volume: 3 Issue: 1

Cite

APA Yılmaz, H., & Yumuşak, S. (2021). Açık Kaynak Doğal Dil İşleme Kütüphaneleri. İstanbul Sabahattin Zaim Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 3(1), 81-85. https://doi.org/10.47769/izufbed.879217

This work is licensed under Creative Commons Attribution 4.0 International License.