Review

THEORETICAL EVALUATION OF TURKISH NATURAL LANGUAGE PROCESSING BASED STUDIES: METHODOLOGICAL CHALLENGES AND FUTURE PERSPECTIVES

Year 2025, Volume: 24 Issue: 48, 686-724, 18.12.2025
https://doi.org/10.55071/ticaretfbd.1677269

Abstract

This study comprehensively examines the developments in Turkish natural language processing (NLP) over the past five years, the methodological challenges encountered, and research perspectives for the future. The agglutinative structure and morphological richness of Turkish require the development of original methods suited to the language's structural complexity. The study evaluates common NLP applications such as text classification, sentiment analysis, question-answering systems, and word embedding models. In particular, the performance of transformer-based models such as BERT and GPT on Turkish, and the work on adapting them, is described in detail. The study notes that data scarcity in low-resource languages such as Turkish limits the success of NLP models, and it discusses the contributions of open-source datasets and data augmentation techniques toward solving this problem. Although transformer-based models developed for Turkish, such as BERTurk and BioBERTurk, yield successful results, more work is needed in areas such as machine translation, named entity recognition, and text generation. By pointing to gaps in the literature, the study emphasizes that developing Turkish-specific data resources and NLP methods can also guide work on other agglutinative languages. In conclusion, this review sets out the current challenges and developments in Turkish NLP, offers recommendations for producing effective NLP solutions in low-resource languages, and sets a comprehensive direction for future research.

Ethical Statement

This study complied with research and publication ethics.

Kaynakça

  • Acikalin, U. U., Bardak, B., & Kutlu, M. (2020). Turkish sentiment analysis using BERT. 2020 28th Signal Processing and Communications Applications Conference (Siu). https://doi.org/10.1109/siu49456.2020.9302492
  • Acikgoz, E. C., Erdogan, M., & Yuret, D. (2024). Bridging the Bosphorus: Advancing turkish large language models through strategies for low-resource language adaptation and benchmarking. arXiv preprint arXiv:2405.04685.
  • Adali, E., & Adamov, A. Z. (2016). Sentiment analysis for agglutinative languages. 2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT), Baku, Azerbaijan, 2016, pp. 1-3, doi: 10.1109/ICAICT.2016.7991659.
  • Ahmetoğlu, H., & Daş, R. (2020). Türkçe otel yorumlarıyla eğitilen kelime vektörü modellerinin duygu analizi ile incelenmesi. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 24(2), 455-463.
  • Akça, O. (2023). Natural language processings in legal domain: Classification of turkish legal texts. [Yüksek Lisans Tezi] Marmara Universitesi.
  • Akça, O., Bayrak, G., Issifu, A. M., & Ganіz, M. C. (2022). Traditional machine learning and deep learning-based text classification for turkish law documents using transformers and domain adaptation, " 2022 International Conference on INnovations in Intelligent SysTems and Applications (INISTA), Biarritz, France, 2022, pp. 1-6, doi: 10.1109/INISTA55318.2022.9894051.
  • Aksu, M. Ç., & Karaman, E. (2020). FastText ve kelime çantası kelime temsil yöntemlerinin turistik mekanlar için yapılan türkçe incelemeler kullanılarak karşılaştırılması. Avrupa Bilim ve Teknoloji Dergisi(20), 311- 320.
  • Al Nahas, A., Kulunk, A., Gozutok, B., Kalkan, S. C., & Erdinc, H. Y. (2020). how to segment turkish words for neural text classification. 2020 International Conference on INnovations in Intelligent SysTems and Applications (INISTA),
  • Aladağ, F. (2023). Osmanlı çalışmalarında GPT’nin potansiyeli: Evliya Çelebi Seyahatnamesinin NLP ve metin madenciliği ile uygulamalı analizi ve TEI yöntemiyle dijital edisyonu. I. Evliya Çelebi Sempozyumu. İstanbul.
  • Alecakir, H., Bölücü, N., & Can, B. (2022). TurkishDelightNLP: A neural Turkish NLP toolkit, Proceedings of the 2022 Conference of the North American
  • Altinok, D. (2023). A diverse set of freely available linguistic resources for Turkish. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Acl 2023): Long Papers, Vol 1, 13739-13750.
  • Alzoubi, Y. I., Topcu, A. E., & Erkaya, A. E. (2023). Machine learning-based text classification comparison: Turkish language context. Applied Sciences, 13(16), 9428.
  • Aram, K., Erdemir, G., & Can, B. (2024). Formation control of multiple autonomous mobile robots using Turkish natural language processing. Applied Sciences, 14(9), 3722.
  • Aras, A. C., Öztürk, C. E., & Koç, A. (2022). Feedforward neural network based case prediction in Turkish higher courts.
  • Aras, G., Makaroğlu, D., Demir, S., & Cakir, A. (2021). An evaluation of recent neural sequence tagging models in Turkish named entity recognition. Expert Systems with Applications, 182, 115049.
  • Arslan, T. P., & Eryiğit, G. (2023). Incorporating dropped pronouns into coreference resolution: the case for Turkish.
  • Avşaroğlu, M., & Karadağ, A. B. (2019). “Foreign language creation” and “textless back translation”: A case study on Turkish translations of jason goodwin’s ottoman-themed works written in English. Advances in Language and Literary Studies, 10(5), 107-119.
  • Aydemir, E. (2023, 2023). Estimation of Turkish constitutional court decisions in terms of admissibility with NLP.
  • Aydoğan, M., & Karci, A. (2020). Improving the accuracy using pre-trained word embeddings on deep neural networks for Turkish text classification. Physica A: Statistical Mechanics and its Applications, 541, 123288.
  • Aydoğan, M., & Kocaman, V. (2023). TRSAv1: a new benchmark dataset for classifying user reviews on Turkish e-commerce websites. Journal of Information Science, 49(6), 1711-1725.
  • Aytan, B., & Sakar, C. O. (2022, 2022). Comparison of transformer-based models trained in Turkish and different languages on Turkish natural language processing problems.
  • Aytan, B., & ŞAkar, C. O. (2023). Deep learning-based Turkish spelling error detection with a multi-class false positive reduction model. Turkish Journal of Electrical Engineering and Computer Sciences, 31(3), 581-595.
  • Ayverdi, S., Öncevarlik, A., Uçar, M., & Adali, E. (2020, 2020). Time and object based question and answering system for Turkish.
  • Ba Alawi, A., & Bozkurt, F. (2024). Performance analysis of embedding methods for deep learning-based Turkish sentiment analysis Models. Arabian Journal for Science and Engineering, 1-23.
  • Bağcı, A., & Amasyali, M. F. (2021, 2021). Comparison of Turkish paraphrase generation models.
  • Balcıoğlu, Y. S. (2024). Detecting Turkish cyberbullying tweets using machine learning. Düzce Üniversitesi Bilim ve Teknoloji Dergisi, 12(3), 1410-1428.
  • Balli, C., Guzel, M. S., Bostanci, E., & Mishra, A. (2022). Sentimental analysis of Twitter users from Turkish content with natural language processing. Computational Intelligence and Neuroscience, 2022(1), 2455160.
  • Barbieri, F., Anke, L. E., & Camacho-Collados, J. (2021). XLM-T: Multilingual language models in Twitter for sentiment analysis and beyond. arXiv preprint arXiv:2104.12250.
  • Baykara, B., & Güngör, T. (2022). Abstractive text summarization and new large-scale datasets for agglutinative languages Turkish and Hungarian. Language Resources and Evaluation, 56(3), 973-1007.
  • Boltayevich, E. B., Adalι, E., Mirdjonovna, K. S., Xolmo'Minovna, A. O., Yuldashevna X. Z., & Uktamboy O'g'li, X. N. (2023). The problem of pos tagging and stemming for agglutinative languages (Turkish, Uyghur, Uzbek Languages). 2023 8th International Conference on Computer Science and Engineering (UBMK)
  • Bozuyla, M. (2024). Sentiment analysis of Turkish drug reviews with bidirectional encoder representations from transformers. ACM Transactions on Asian and Low-Resource Language Information Processing, 23(1), 1-17.
  • Bozuyla, M., & Özçift, A. (2022). Developing a fake news identification model with advanced deep languagetransformers for Turkish COVID-19 misinformation data. Turkish Journal of Electrical Engineering and Computer Sciences, 30(3), 908-926.
  • Bölücü, N., & Can, B. (2019). Unsupervised joint PoS tagging and stemming for agglutinative languages. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 18(3), 1-21.
  • Budur, E., Özçelik, R., Güngör, T., & Potts, C. (2020). Data and representation for Turkish natural language inference. arXiv preprint arXiv:2004.14963.
  • Carik, B., & Yeniterzi, R. (2021, 2021). SU-NLP at CheckThat! 2021: Check- Worthiness of Turkish Tweets.
  • Cavusoglu, I., Pielka, M., & Sifa, R. (2020). Adapting established text representations for predicting review sentiment in Turkish. 2020 Ieee 7th International Conference on Data Science and Advanced Analytics (Dsaa 2020), 755-756. https://doi.org/10.1109/Dsaa49011.2020.00100
  • Conneau, A. (2019). Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116.
  • Çam, N. B., & Özgür, A. (2023). Evaluation of chatgpt and bert-based models for turkish hate speech detection. 2023 8th International Conference on Computer Science and Engineering (UBMK) 229-233
  • Çarık, B., & Yeniterzi, R. (2022). A Twitter Corpus for named entity recognition in Turkish. Proceedings of the Thirteenth Language Resources and Evaluation Conference(LREC),4546-4551
  • Çelıkten, A., & Bulut, H. (2021). Turkish medical text classification using BERT. 2021 29th Signal Processing and Communications Applications Conference (SIU) ,1-4
  • Çetindağ, C., Yazıcıoğlu, B., & Koç, A. (2023). Named-entity recognition in Turkish legal texts. Natural Language Engineering, 29(3), 615-642.
  • Çöltekin, Ç. (2014). A set of open source tools for Turkish natural language processing. Proceedings of the Thirteenth Language Resources and Evaluation Conference(LREC) 1079-1086.
  • Çöltekin, Ç., Dogruöz, A. S., & Çetinoglu, Ö. (2023). Resources for Turkish natural language processing. Language Resources and Evaluation, 57(1), 449-488. https://doi.org/10.1007/s10579-022-09605-4
  • Demir, S., & Topcu, B. (2022). Graph-based Turkish text normalization and its impact on noisy text processing. Engineering Science and Technology, an International Journal, 35, 101192.
  • Demirci, G. M., Keskin, Ş. R., & Doğan, G. (2019). Sentiment analysis in Turkish with deep learning. 2019 IEEE International Conference on Big Data (Big Data), 2215-2221.
  • Doğan, B., Balcioglu, Y. S., & Elçi, M. (2024). Multidimensional sentiment analysis method on social media data: comparison of emotions during and after the COVID-19 pandemic. Kybernetes.
  • Dönmez, İ., & Adalı, E. (2015). Türkçe tümce çözümlemede vektör yaklaşımı. Afyon Kocatepe Üniversitesi Fen Ve Mühendislik Bilimleri Dergisi, 15(3), 1-11.
  • Dündar, E. B., Kiliç, O. F., Cekiç, T., Manav, Y., & Deniz, O. (2020, 2020). large scale intent detection in Turkish short sentences with contextual word embeddings. In Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020)
  • EmreOztürk, C. (2023). Retrieving turkish prior legal cases with deep learning. [Doktora Tezi]. Bilkent Üniversitesi.
  • Eryiğit, G. (2014). ITU Turkish NLP web service, Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics. 2014. p. 1-4.
  • Feng, S. Y., Gangal, V., Wei, J., Chandar, S., Vosoughi, S., Mitamura, T., & Hovy, E. (2021). A survey of data augmentation approaches for NLP. arXiv preprint arXiv:2105.03075.
  • Firat, M. (2020). Öğrenci destek servislerinde doğal dil işleme: GPT-3 örneği. International Conference of Strategic Research in Social Science and Education. 2020. p. 532-536.
  • Firoozi, T., Bulut, O., & Gierl, M. (2023). Language models in automated essay scoring: Insights for the Turkish language. International Journal of Assessment Tools in Education, 10(Special Issue), 149-163.
  • Freiling, I. (2019). Detecting misinformation in online social networks: A think-aloud study on user strategies. Scm Studies in Communication and Media, 8(4), 471-496. https://doi.org/10.5771/2192-4007-2019-4-471
  • Gemirter, C. B., & Goularas, D. (2021). A Turkish question answering system based on deep learning neural networks [Derin Öğrenme Sinir Ağlarına Dayalı Türkçe Soru Cevaplama Sistemi]. Journal of Intelligent Systems: Theory and Applications, 4(2), 65-75. https://doi.org/10.38016/jista.815823
  • Girgin, A. B. A., Gümüsçekiççi, G., & Birdemir, N. C. (2024). Turkish sentiment analysis: A comprehensive review. Sigma Journal of Engineering and Natural Sciences-Sigma Muhendislik ve Fen Bilimleri Dergisi, 42(4), 1292-1314. https://doi.org/10.14744/sigma.2024.00033
  • Girgin, A. B. A., & Şahin, S. (2023). Improving the performance of sentiment analysis by ensemble hybrid learning algorithm with nlp and cascaded feature extraction. International Journal of Advances in Engineering and Pure Sciences, 35(1), 125-141.
  • Güler, G., & Tantuğ, A. C. (2020). Comparison of Turkish word representations trained on different morphological forms. arXiv preprint arXiv:2002.05417.
  • Haque, M. R., Lima, S. A., & Mishu, S. Z. (2019). Performance analysis of different neural networks for sentiment analysis on IMDb movie reviews.3rd International conference on electrical, computer & telecommunication engineering (ICECTE). IEEE, 2019. p. 161-164.
  • Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2016). Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759.
  • Karagöz, F., Doğan, B., & Özateş, Ş. (2024). Towards a clean text corpus for Ottoman Turkish. Proceedings of the First Workshop on Natural Language Processing for Turkic Languages (SIGTURK 2024). 2024. p. 62-70.
  • Karaoğlan, B., Yorgancioğlu, H. E., Kişla, T., & Metın, S. K. (2019, 2019). The Impact of sentence embeddings in Turkish paraphrase detection. 2019 27th Signal Processing and Communications Applications Conference (SIU) (pp. 1-4).
  • Karayiğit, H., Akdagli, A., & Aci, Ç. İ. (2022). Homophobic and hate speech detection using multilingual-bert model on turkish social media. Information Technology and Control, 51(2), 356-375.
  • Katar, O., Özkan, D., Yıldırım, Ö., & Acharya, U. R. (2023). Evaluation of GPT-3 AI language model in research paper writing. Turkish Journal of Science and Technology, 18(2), 311-318.
  • Kaya, Y. B., & Tantug, A. C. (2022,). Finding the optimal vocabulary size for Turkish named entity recognition. ALTNLP. 2022. p. 99-106.
  • Kaya, Y. B., & Tantuğ, A. C. (2024). BERT2D: Two Dimensional positional embeddings for efficient Turkish NLP. IEEE Access.
  • Kaya, Y. B., & Tantuğ, A. C. (2024). Effect of tokenization granularity for Turkish large language models. Intelligent Systems with Applications, 21, 200335.
  • Kemaloğlu, N., Küçüksille, E., & Özgünsür, M. (2021). Turkish sentiment analysis on social media. Sakarya University Journal of Science, 25(3), 629-638.
  • Kesgin, H. T., Yuce, M. K., & Amasyali, M. F. (2023). Developing and evaluating tiny to medium-sized turkish bert models. arXiv preprint arXiv:2307.14134.
  • Kesgin, H. T., Yuce, M. K., Dogan, E., Uzun, M. E., Uz, A., Seyrek, H. E., Zeer, A., & Amasyali, M. F. (2024). Introducing cosmosGPT: Monolingual training for Turkish language models. arXiv preprint arXiv:2404.17336.
  • Kilimci, Z. H., & Akyokuş, S. (2019, 2019). The evaluation of word embedding models and deep learning algorithms for Turkish text classification.
  • Kirelli, Y., & Arslankaya, S. (2020). Sentiment analysis of shared tweets on global warming on twitter with data mining methods: a case study on Turkish language. Computational Intelligence and Neuroscience, 2020(1), 1904172. Koksal, A. T., Bozal, O., Yürekli, E., & Gezici, G. (2020). TurkishTweets: A benchmark dataset for Turkish text correction. In Findings of the Association for Computational Linguistics: EMNLP 2020.
  • Kontuk, R., & Turan, M. (2020). NLP kullanılarak haberlerin yaş gruplarına göre sınıflandırılması. Gazi University Journal of Science Part C: Design and Technology, 8(2), 372-382.
  • Koru, G. K., & Uluyol, Ç. (2024). Detection of Turkish fake news from tweets with bert models. IEEE Access.
  • Köksal, A., & Özgür, A. (2021). Twitter dataset and evaluation of transformers for Turkish sentiment analysis. 29th Signal Processing and Communications Applications Conference (SIU). IEEE, 2021. p. 1-4.
  • Köksal, Ö., & Yılmaz, E. H. (2022). Improving automated Turkish text classification with learning‐based algorithms. Concurrency and Computation: Practice and Experience, 34(11), e6874.
  • Kuruca, Y., Üstüner, M., & Şimşek, I. (2022). Dijital pazarlamada yapay zekâ kullanımı: Sohbet robotu (Chatbot). Medya ve Kültür, 2(1), 88-113.
  • Kuyumcu, B., Aksakalli, C., & Delil, S. (2019). An automated new approach in fast text classification (fastText) A case study for Turkish text classification without pre-processing. Proceedings of the 2019 3rd international conference on natural language processing and information retrieval. 2019. p. 1-4.
  • Küçük, D., & Can, F. (2019). A tweet dataset annotated for named entity recognition and stance detection. arXiv preprint arXiv:1901.04787.
  • Li, G., Wang, Z., Zhao, M., Song, Y., & Lan, L. (2022). Sentiment analysis of political posts on Hong Kong local forums using fine-tuned mBERT. 2022 IEEE International Conference on Big Data (Big Data), 2022 IEEE International Conference on Big Data (Big Data). IEEE, 2022. p. 6763-6765.
  • Marsan, B., Kara, N., Ozçelik, M., Arıcan, B. N., Cesur, N., Kuzgun, A., Sanıyar, E., Kuyrukçu, O., & Yıldız, O. T. (2021). Building the turkish framenet. South African Centre for Digital Language Resources (SADiLaR) Potchefstroom, South Africa, 118.
  • Morwal, S., Jahan, N., & Chopra, D. (2012). Named entity recognition using hidden Markov model (HMM). International Journal on Natural Language Computing (IJNLC) Vol, 1.
  • Muller, B., Gupta, D., Fauconnier, J.-P., Patwardhan, S., Vandyke, D., & Agarwal, S. (2023). Languages you know influence those you learn: Impact of language characteristics on multi-lingual text-to-text transfer. Transfer Learning for Natural Language Processing Workshop. PMLR, 2023. p. 88-102.
  • Mumcuoğlu, E., Öztürk, C. E., Ozaktas, H. M., & Koç, A. (2021). Natural language processing in law: Prediction of outcomes in the higher courts of Turkey. Information Processing & Management, 58(5), 102684.
  • Najafi, A., & Varol, O. (2024). Turkishbertweet: Fast and reliable large language model for social media analysis. Expert Systems with Applications, 255, 124737.
  • Nangia, N., & Bowman, S. R. (2019). Human vs. muppet: A conservative estimate of human performance on the GLUE benchmark. arXiv preprint arXiv:1905.10425.
  • Nasution, A. H., & Onan, A. (2024). ChatGPT label: Comparing the quality of human-generated and LLM-generated annotations in low-resource language NLP tasks. IEEE Access.
  • Nazaretsky, T., Yolcu, H. H., Ariely, M., & Alexandron, G. (2023). Towards automated assessment of scientific explanations in Turkish using language transfer. Proceedings of the 16th International Conference on Educational Data Mining. 2023. p. 453-457.
  • Nezhad, S. B., & Agrawal, A. (2023). mBBC: Exploring the multilingual maze. arXiv preprint arXiv:2310.05404.
  • Okur, H. I., & Sertbaş, A. (2021). Pretrained neural models for turkish text classification. 2021 6th International Conference on Computer Science and Engineering (UBMK). IEEE, 2021. p. 174-179.
  • Onan, A., & Balbal, K. F. (2024). Improving Turkish text sentiment classification through task-specific and universal transformations: an ensemble data augmentation approach. IEEE Access.
  • Ozcelik, O., & Toraman, C. (2022). Named entity recognition in Turkish: A comparative study with detailed error analysis. Information Processing & Management, 59(6), 103065.
  • Ozdemir, A., & Yeniterzi, R. (2020). Su-nlp at semeval-2020 task 12: Offensive language identification in turkish tweets. Proceedings of the Fourteenth Workshop on Semantic Evaluation. 2020. p. 2171-2176.
  • Özateş, Ş. B., Tıraş, T. E., Genç, E. E., & Taşdemir, E. F. B. (2024). Dependency annotation of Ottoman Turkish with multilingual BERT. arXiv preprint arXiv:2402.14743.
  • Özçift, A., Akarsu, K., Yumuk, F., & Söylemez, C. (2021). Advancing natural language processing (NLP) applications of morphologically rich languages with bidirectional encoder representations from transformers (BERT): an empirical case study for Turkish. Automatika, 62(2), 226-238. https://doi.org/10.1080/00051144.2021.1922150
  • Özkan, M., & Kar, G. (2022). Türkçe dilinde yazilan bilimsel metinlerin derin öğrenme tekniği uygulanarak çoklu siniflandirilmasi. Mühendislik Bilimleri ve Tasarım Dergisi, 10(2), 504-519.
  • Öztürk, C. E., Özçelik, S. B., & Koç, A. (2023). A Transformer-based prior legal case retrieval method. 2023 31st Signal Processing and Communications ApplicationsCon.,Siu https://doi.org/10.1109/Siu59756.2023.10223938
  • Panchendrarajan, R., & Amaresan, A. (2018, 2018). Bidirectional LSTM-CRF for named entity recognition.
  • Rajpurkar, P., Jia, R., & Liang, P. (2018). Know what you don't know: Unanswerable questions for SQuAD. arXiv preprint arXiv:1806.03822.
  • Ryu, M., & Nakajima, K. (2022). Analysis and mitigation of dataset artifacts in OpenAI GPT-3. In.
  • Safaya, A., Kurtuluş, E., Göktoğan, A., & Yuret, D. (2022). Mukayese: Turkish NLP strikes back. arXiv preprint arXiv:2203.01215.
  • Sanh, V. (2019). DistilBERT, A Distilled Version of BERT: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
  • Sarıtaş, K., Öz, C. A., & Güngör, T. (2024). A comprehensive analysis of static word embeddings for Turkish. Expert Systems with Applications, 252, 124123.
  • Schweter, S. (2020). Berturk-bert models for Turkish, April 2020. URL https://doi. org/10.5281/zenodo, 3770924.
  • Sert, M. F., Yıldırım, E., & Haşlak, İ. (2022). Using artificial intelligence to predict decisions of the Turkish constitutional court. Social Science Computer Review, 40(6), 1416-1435.
  • Song, B., Li, Z., Lin, X., Wang, J., Wang, T., & Fu, X. (2021). Pretraining model for biological sequence data. Briefings in functional genomics, 20(3), 181-195.
  • Soygazi, F., Çiftçi, O., Kök, U., & Cengiz, S. (2021). THQuAD: Turkish historic question answering dataset for reading comprehension. 2021 6th international conference on computer science and engineering (UBMK). IEEE, 2021. p. 215-220.
  • Srinivasan, A., Sitaram, S., Ganu, T., Dandapat, S., Bali, K., & Choudhury, M. (2021). Predicting the performance of multilingual nlp models. arXiv preprint arXiv:2110.08875.
  • Suncak, A., & Aktaş, Ö. (2021). A novel approach for detecting defective expressions in Turkish. Journal of Artificial Intelligence and Data Science, 1(1), 35-40.
  • Suncak, A., & Aktaş, Ö. (2022). Detecting Defective Expressions in Turkish Sentences Using a Hybrid Deep Learning Method. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi, 24(72), 825-834.
  • Şahin, G. G. (2024). Bridging the Gap Between Wikipedians and Scientists with Terminology-Aware Translation: A Case Study in Turkish. Wikimedia Research Fund 2024
  • Şapcı, A. O. B., Taştan, Ö., & Yeniterzi, R. (2020). Active Learning for Turkish Text Classification. 28th Signal Processing and Communications Applications Conference (SIU). IEEE, 2020. p. 1-4.
  • Tohma, K., & Kutlu, Y. (2020). Challenges Encountered in Turkish Natural Language Processing Studies. Natural and Engineering Sciences, 5(3), 204-211.
  • Tohma, K., Okur, H. I., Kutlu, Y., & Sertbas, A. (2023). Sentiment Analysis in Turkish Question Answering Systems: An Application of Human-Robot Interaction. IEEE Access.
  • Tokcaer, S. (2021). Türkçe metinlerde duygu analizi. Yaşar Üniversitesi E-Dergisi, 16(63), 1514-1534.
  • Toraman, C., Ozcelik, O., Sahinuç, F., & Sahin, U. (2022, 2022). ARC-NLP at CheckThat!-2022: Contradiction for Harmful Tweet Detection. CLEF (Working Notes). 2022. p. 722-739.
  • Touheed, M., Zubair, U., Sabir, D., Hassan, A., Butt, M. F. U., Riaz, F., Abdul, W., & Ayub, R. (2024). Applications of Pruning Methods in Natural Language Processing. IEEE Access.
  • Tulu, C. N. (2022). Experimental comparison of pre-trained word embedding vectors of Word2Vec, glove, FastText for word level semantic text similarity measurement in turkish. Advances in Science and Technology. Research Journal, 16(4).
  • Tunali, V. (2022). Improved prioritization of software development demands in Turkish with deep learning-based NLP. IEEE Access, 10, 40249-40263.
  • Turker, M., Ari, M. E., & Han, A. (2024). VNLP: Turkish NLP Package. arXiv preprint arXiv:2403.01309.
  • Türk, U., Atmaca, F., Özates, S. B., Berk, G., Bedir, S. T., Köksal, A., Basaran, B. Ö., Güngör, T., & Özgür, A. (2022). Resources for Turkish dependency parsing: introducing the BOUN Treebank and the BoAT annotation tool. Language Resources and Evaluation, 56(1), 259-307. https://doi.org/10.1007/s10579-021-09558-0
  • Türkmen, H., Dikenelli, O., Eraslan, C., Callı, M. C., & Özbek, S. S. (2023). BioBERTurk: Exploring Turkish Biomedical Language Model Development Strategies in Low-Resource Setting. Journal of Healthcare Informatics Research, 7(4), 433-446.
  • Türkmen, H., Dikenelli, O., Eraslan, C., Çalli, M. C., & Ozbek, S. S. (2022). Developing Pretrained Language Models for Turkish Biomedical Domain. 2022 IEEE 10th International Conference on Healthcare Informatics (ICHI), IEEE, 2022. 597-598.
  • Türkmen, H., Dikenelli, O., Eraslan, C., Çallı, M. C., & Özbek, S. S. (2023). Harnessing the power of BERT in the Turkish clinical domain: pretraining approaches for limited data scenarios. arXiv preprint arXiv:2305.03788.
  • Uludoğan, G., Balal, Z. Y., Akkurt, F., Türker, M., Güngör, O., & Üsküdarlı, S. (2024). Turna: A turkish encoder-decoder language model for enhanced understanding and generation. arXiv preprint arXiv:2401.14373.
  • Uskudarli, S., Şen, M., Akkurt, F., Gürbüz, M., Güngör, O., Özgür, A., & Güngör, T. (2023). TULAP-An Accessible and Sustainable Platform for Turkish Natural Language Processing Resources. Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations. 2023. p. 219-227.
  • Uymaz, H. A., & Metin, S. K. (2023a). Emotion-enriched word embeddings for Turkish. Expert Systems with Applications, 225, 120011.
  • Uymaz, H. A., & Metin, S. K. (2023b). Enriching Transformer-Based Embeddings for Emotion Identification in an Agglutinative Language: Turkish. It Professional, 25(4), 67-73.
  • Wikipedia. (2024). Türk dilleri. Wikipedia. Retrieved 18.10.2024 from https://tr.wikipedia.org/wiki/T%C3%BCrk_dilleri
  • Xu, Q. A., Chang, V., & Jayne, C. (2022). A systematic review of social media-based sentiment analysis: Emerging trends and challenges. Decision Analytics Journal, 3, 100073.
  • Xue, L. (2020). mt5: A massively multilingual pre-trained text-to-text transformer. arXiv preprint arXiv:2010.11934.
  • Yao, S., Peng, B., Papadimitriou, C., & Narasimhan, K. (2021). Self-attention networks can process bounded hierarchical languages. arXiv preprint arXiv:2105.11115.
  • Yazar, B. K., Şahın, D. Ö., & Kiliç, E. (2023). Low-resource neural machine translation: A systematic literature review. IEEE Access, 11, 131775-131813.
  • Yıldırım, A., Cetiner, M., Öksüz, C., & Onay, B. (2021). A search tool in Turkish using contextual vectors. 2021 29th Signal Processing and Communications Applications Conference (SIU). IEEE, 2021. p. 1-4.
  • Yıldız, O. T., Avar, B., & Ercan, G. (2019). An open, extendible, and fast Turkish morphological analyzer. Proceedings of Recent Advances in Natural Language Processing, pages 1364–1372 Yilmaz, E. H., & Toraman, C. (2021). Intent classification based on deep learning language model in turkish dialog systems. 2021 29th Signal Processing and Communications Applications Conference (SIU) p 1-4
  • Yucalar, F. (2023). Developing an advanced software requirements classification model using bert: An empirical evaluation study on newly generated turkish data. Applied Sciences, 13(20), 11127.
  • Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H., & He, Q. (2020). A comprehensive survey on transfer learning. Proceedings of the IEEE, 109(1), 43-76.
  • Zorarpaci, E. (2023). A Turkish Text Classification Based Feature Selection and Density Peaks Clustering. In 2023 31st Signal Processing and Communications Applications Conference (SIU) (pp. 1-4). IEEE.
  • Zovikoğlu, M., & Çetin, U. (2024). Detecting misinformation on social networks with NLP. Transactions on Computer Science and Applications, 1(1), 11-16.

THEORETICAL EVALUATION OF TURKISH NATURAL LANGUAGE PROCESSING BASED STUDIES: METHODOLOGICAL CHALLENGES AND FUTURE PERSPECTIVES


Abstract

This study comprehensively addresses the developments in Turkish natural language processing (NLP) over the past five years, the methodological challenges encountered, and future research perspectives. The agglutinative structure and morphological richness of Turkish require methods designed for the structural complexity of the language. The study evaluates common NLP applications such as text classification, sentiment analysis, question-answering systems, and word embedding models. In particular, it details the performance of transformer-based models such as BERT and GPT on Turkish and the studies that adapt them. It notes that data scarcity in low-resource languages such as Turkish limits the success of NLP models, and it discusses how open-source datasets and data augmentation techniques help address this problem. Although transformer-based models developed for Turkish, such as BERTurk and BioBERTurk, have produced successful results, further research is needed in areas such as machine translation, named entity recognition, and text generation. By pointing to gaps in the literature, the study emphasizes that developing Turkish-specific data resources and NLP methods could also inform work on other agglutinative languages. In conclusion, this review highlights the current challenges and advances in Turkish NLP and offers suggestions for producing effective NLP solutions in low-resource languages.

Kaynakça

  • Acikalin, U. U., Bardak, B., & Kutlu, M. (2020). Turkish sentiment analysis using BERT. 2020 28th Signal Processing and Communications Applications Conference (Siu). https://doi.org/10.1109/siu49456.2020.9302492
  • Acikgoz, E. C., Erdogan, M., & Yuret, D. (2024). Bridging the Bosphorus: Advancing turkish large language models through strategies for low-resource language adaptation and benchmarking. arXiv preprint arXiv:2405.04685.
  • Adali, E., & Adamov, A. Z. (2016). Sentiment analysis for agglutinative languages. 2016 IEEE 10th International Conference on Application of Information and Communication Technologies (AICT), Baku, Azerbaijan, 2016, pp. 1-3, doi: 10.1109/ICAICT.2016.7991659.
  • Ahmetoğlu, H., & Daş, R. (2020). Türkçe otel yorumlarıyla eğitilen kelime vektörü modellerinin duygu analizi ile incelenmesi. Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, 24(2), 455-463.
  • Akça, O. (2023). Natural language processings in legal domain: Classification of turkish legal texts. [Yüksek Lisans Tezi] Marmara Universitesi.
  • Akça, O., Bayrak, G., Issifu, A. M., & Ganіz, M. C. (2022). Traditional machine learning and deep learning-based text classification for turkish law documents using transformers and domain adaptation, " 2022 International Conference on INnovations in Intelligent SysTems and Applications (INISTA), Biarritz, France, 2022, pp. 1-6, doi: 10.1109/INISTA55318.2022.9894051.
  • Aksu, M. Ç., & Karaman, E. (2020). FastText ve kelime çantası kelime temsil yöntemlerinin turistik mekanlar için yapılan türkçe incelemeler kullanılarak karşılaştırılması. Avrupa Bilim ve Teknoloji Dergisi(20), 311- 320.
  • Al Nahas, A., Kulunk, A., Gozutok, B., Kalkan, S. C., & Erdinc, H. Y. (2020). how to segment turkish words for neural text classification. 2020 International Conference on INnovations in Intelligent SysTems and Applications (INISTA),
  • Aladağ, F. (2023). Osmanlı çalışmalarında GPT’nin potansiyeli: Evliya Çelebi Seyahatnamesinin NLP ve metin madenciliği ile uygulamalı analizi ve TEI yöntemiyle dijital edisyonu. I. Evliya Çelebi Sempozyumu. İstanbul.
  • Alecakir, H., Bölücü, N., & Can, B. (2022). TurkishDelightNLP: A neural Turkish NLP toolkit, Proceedings of the 2022 Conference of the North American
  • Altinok, D. (2023). A diverse set of freely available linguistic resources for Turkish. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Acl 2023): Long Papers, Vol 1, 13739-13750.
  • Alzoubi, Y. I., Topcu, A. E., & Erkaya, A. E. (2023). Machine learning-based text classification comparison: Turkish language context. Applied Sciences, 13(16), 9428.
  • Aram, K., Erdemir, G., & Can, B. (2024). Formation control of multiple autonomous mobile robots using Turkish natural language processing. Applied Sciences, 14(9), 3722.
  • Aras, A. C., Öztürk, C. E., & Koç, A. (2022). Feedforward neural network based case prediction in Turkish higher courts.
  • Aras, G., Makaroğlu, D., Demir, S., & Cakir, A. (2021). An evaluation of recent neural sequence tagging models in Turkish named entity recognition. Expert Systems with Applications, 182, 115049.
  • Arslan, T. P., & Eryiğit, G. (2023). Incorporating dropped pronouns into coreference resolution: the case for Turkish.
  • Avşaroğlu, M., & Karadağ, A. B. (2019). “Foreign language creation” and “textless back translation”: A case study on Turkish translations of jason goodwin’s ottoman-themed works written in English. Advances in Language and Literary Studies, 10(5), 107-119.
  • Aydemir, E. (2023, 2023). Estimation of Turkish constitutional court decisions in terms of admissibility with NLP.
  • Aydoğan, M., & Karci, A. (2020). Improving the accuracy using pre-trained word embeddings on deep neural networks for Turkish text classification. Physica A: Statistical Mechanics and its Applications, 541, 123288.
  • Aydoğan, M., & Kocaman, V. (2023). TRSAv1: a new benchmark dataset for classifying user reviews on Turkish e-commerce websites. Journal of Information Science, 49(6), 1711-1725.
  • Aytan, B., & Sakar, C. O. (2022, 2022). Comparison of transformer-based models trained in Turkish and different languages on Turkish natural language processing problems.
  • Aytan, B., & ŞAkar, C. O. (2023). Deep learning-based Turkish spelling error detection with a multi-class false positive reduction model. Turkish Journal of Electrical Engineering and Computer Sciences, 31(3), 581-595.
  • Ayverdi, S., Öncevarlik, A., Uçar, M., & Adali, E. (2020, 2020). Time and object based question and answering system for Turkish.
  • Ba Alawi, A., & Bozkurt, F. (2024). Performance analysis of embedding methods for deep learning-based Turkish sentiment analysis Models. Arabian Journal for Science and Engineering, 1-23.
  • Bağcı, A., & Amasyali, M. F. (2021, 2021). Comparison of Turkish paraphrase generation models.
  • Balcıoğlu, Y. S. (2024). Detecting Turkish cyberbullying tweets using machine learning. Düzce Üniversitesi Bilim ve Teknoloji Dergisi, 12(3), 1410-1428.
  • Balli, C., Guzel, M. S., Bostanci, E., & Mishra, A. (2022). Sentimental analysis of Twitter users from Turkish content with natural language processing. Computational Intelligence and Neuroscience, 2022(1), 2455160.
  • Barbieri, F., Anke, L. E., & Camacho-Collados, J. (2021). XLM-T: Multilingual language models in Twitter for sentiment analysis and beyond. arXiv preprint arXiv:2104.12250.
  • Baykara, B., & Güngör, T. (2022). Abstractive text summarization and new large-scale datasets for agglutinative languages Turkish and Hungarian. Language Resources and Evaluation, 56(3), 973-1007.
  • Boltayevich, E. B., Adalι, E., Mirdjonovna, K. S., Xolmo'Minovna, A. O., Yuldashevna X. Z., & Uktamboy O'g'li, X. N. (2023). The problem of pos tagging and stemming for agglutinative languages (Turkish, Uyghur, Uzbek Languages). 2023 8th International Conference on Computer Science and Engineering (UBMK)
  • Bozuyla, M. (2024). Sentiment analysis of Turkish drug reviews with bidirectional encoder representations from transformers. ACM Transactions on Asian and Low-Resource Language Information Processing, 23(1), 1-17.
  • Bozuyla, M., & Özçift, A. (2022). Developing a fake news identification model with advanced deep languagetransformers for Turkish COVID-19 misinformation data. Turkish Journal of Electrical Engineering and Computer Sciences, 30(3), 908-926.
  • Bölücü, N., & Can, B. (2019). Unsupervised joint PoS tagging and stemming for agglutinative languages. ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP), 18(3), 1-21.
  • Budur, E., Özçelik, R., Güngör, T., & Potts, C. (2020). Data and representation for Turkish natural language inference. arXiv preprint arXiv:2004.14963.
  • Carik, B., & Yeniterzi, R. (2021, 2021). SU-NLP at CheckThat! 2021: Check- Worthiness of Turkish Tweets.
  • Cavusoglu, I., Pielka, M., & Sifa, R. (2020). Adapting established text representations for predicting review sentiment in Turkish. 2020 Ieee 7th International Conference on Data Science and Advanced Analytics (Dsaa 2020), 755-756. https://doi.org/10.1109/Dsaa49011.2020.00100
  • Conneau, A. (2019). Unsupervised cross-lingual representation learning at scale. arXiv preprint arXiv:1911.02116.
  • Çam, N. B., & Özgür, A. (2023). Evaluation of chatgpt and bert-based models for turkish hate speech detection. 2023 8th International Conference on Computer Science and Engineering (UBMK) 229-233
  • Çarık, B., & Yeniterzi, R. (2022). A Twitter Corpus for named entity recognition in Turkish. Proceedings of the Thirteenth Language Resources and Evaluation Conference(LREC),4546-4551
  • Çelıkten, A., & Bulut, H. (2021). Turkish medical text classification using BERT. 2021 29th Signal Processing and Communications Applications Conference (SIU) ,1-4
  • Çetindağ, C., Yazıcıoğlu, B., & Koç, A. (2023). Named-entity recognition in Turkish legal texts. Natural Language Engineering, 29(3), 615-642.
  • Çöltekin, Ç. (2014). A set of open source tools for Turkish natural language processing. Proceedings of the Thirteenth Language Resources and Evaluation Conference(LREC) 1079-1086.
  • Çöltekin, Ç., Dogruöz, A. S., & Çetinoglu, Ö. (2023). Resources for Turkish natural language processing. Language Resources and Evaluation, 57(1), 449-488. https://doi.org/10.1007/s10579-022-09605-4
  • Demir, S., & Topcu, B. (2022). Graph-based Turkish text normalization and its impact on noisy text processing. Engineering Science and Technology, an International Journal, 35, 101192.
  • Demirci, G. M., Keskin, Ş. R., & Doğan, G. (2019). Sentiment analysis in Turkish with deep learning. 2019 IEEE International Conference on Big Data (Big Data), 2215-2221.
  • Doğan, B., Balcioglu, Y. S., & Elçi, M. (2024). Multidimensional sentiment analysis method on social media data: comparison of emotions during and after the COVID-19 pandemic. Kybernetes.
  • Dönmez, İ., & Adalı, E. (2015). Türkçe tümce çözümlemede vektör yaklaşımı. Afyon Kocatepe Üniversitesi Fen Ve Mühendislik Bilimleri Dergisi, 15(3), 1-11.
  • Dündar, E. B., Kiliç, O. F., Cekiç, T., Manav, Y., & Deniz, O. (2020, 2020). large scale intent detection in Turkish short sentences with contextual word embeddings. In Proceedings of the 12th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2020)
  • EmreOztürk, C. (2023). Retrieving turkish prior legal cases with deep learning. [Doktora Tezi]. Bilkent Üniversitesi.
  • Eryiğit, G. (2014). ITU Turkish NLP web service, Proceedings of the Demonstrations at the 14th Conference of the European Chapter of the Association for Computational Linguistics. 2014. p. 1-4.
  • Feng, S. Y., Gangal, V., Wei, J., Chandar, S., Vosoughi, S., Mitamura, T., & Hovy, E. (2021). A survey of data augmentation approaches for NLP. arXiv preprint arXiv:2105.03075.
  • Firat, M. (2020). Öğrenci destek servislerinde doğal dil işleme: GPT-3 örneği. International Conference of Strategic Research in Social Science and Education. 2020. p. 532-536.
  • Firoozi, T., Bulut, O., & Gierl, M. (2023). Language models in automated essay scoring: Insights for the Turkish language. International Journal of Assessment Tools in Education, 10(Special Issue), 149-163.
  • Freiling, I. (2019). Detecting misinformation in online social networks: A think-aloud study on user strategies. Scm Studies in Communication and Media, 8(4), 471-496. https://doi.org/10.5771/2192-4007-2019-4-471
  • Gemirter, C. B., & Goularas, D. (2021). A Turkish question answering system based on deep learning neural networks [Derin Öğrenme Sinir Ağlarına Dayalı Türkçe Soru Cevaplama Sistemi]. Journal of Intelligent Systems: Theory and Applications, 4(2), 65-75. https://doi.org/10.38016/jista.815823
  • Girgin, A. B. A., Gümüsçekiççi, G., & Birdemir, N. C. (2024). Turkish sentiment analysis: A comprehensive review. Sigma Journal of Engineering and Natural Sciences-Sigma Muhendislik ve Fen Bilimleri Dergisi, 42(4), 1292-1314. https://doi.org/10.14744/sigma.2024.00033
  • Girgin, A. B. A., & Şahin, S. (2023). Improving the performance of sentiment analysis by ensemble hybrid learning algorithm with nlp and cascaded feature extraction. International Journal of Advances in Engineering and Pure Sciences, 35(1), 125-141.
  • Güler, G., & Tantuğ, A. C. (2020). Comparison of Turkish word representations trained on different morphological forms. arXiv preprint arXiv:2002.05417.
  • Haque, M. R., Lima, S. A., & Mishu, S. Z. (2019). Performance analysis of different neural networks for sentiment analysis on IMDb movie reviews.3rd International conference on electrical, computer & telecommunication engineering (ICECTE). IEEE, 2019. p. 161-164.
  • Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2016). Bag of tricks for efficient text classification. arXiv preprint arXiv:1607.01759.
  • Karagöz, F., Doğan, B., & Özateş, Ş. (2024). Towards a clean text corpus for Ottoman Turkish. Proceedings of the First Workshop on Natural Language Processing for Turkic Languages (SIGTURK 2024). 2024. p. 62-70.
  • Karaoğlan, B., Yorgancioğlu, H. E., Kişla, T., & Metın, S. K. (2019, 2019). The Impact of sentence embeddings in Turkish paraphrase detection. 2019 27th Signal Processing and Communications Applications Conference (SIU) (pp. 1-4).
  • Karayiğit, H., Akdagli, A., & Aci, Ç. İ. (2022). Homophobic and hate speech detection using multilingual-bert model on turkish social media. Information Technology and Control, 51(2), 356-375.
  • Katar, O., Özkan, D., Yıldırım, Ö., & Acharya, U. R. (2023). Evaluation of GPT-3 AI language model in research paper writing. Turkish Journal of Science and Technology, 18(2), 311-318.
  • Kaya, Y. B., & Tantug, A. C. (2022,). Finding the optimal vocabulary size for Turkish named entity recognition. ALTNLP. 2022. p. 99-106.
  • Kaya, Y. B., & Tantuğ, A. C. (2024). BERT2D: Two Dimensional positional embeddings for efficient Turkish NLP. IEEE Access.
  • Kaya, Y. B., & Tantuğ, A. C. (2024). Effect of tokenization granularity for Turkish large language models. Intelligent Systems with Applications, 21, 200335.
  • Kemaloğlu, N., Küçüksille, E., & Özgünsür, M. (2021). Turkish sentiment analysis on social media. Sakarya University Journal of Science, 25(3), 629-638.
  • Kesgin, H. T., Yuce, M. K., & Amasyali, M. F. (2023). Developing and evaluating tiny to medium-sized turkish bert models. arXiv preprint arXiv:2307.14134.
  • Kesgin, H. T., Yuce, M. K., Dogan, E., Uzun, M. E., Uz, A., Seyrek, H. E., Zeer, A., & Amasyali, M. F. (2024). Introducing cosmosGPT: Monolingual training for Turkish language models. arXiv preprint arXiv:2404.17336.
  • Kilimci, Z. H., & Akyokuş, S. (2019, 2019). The evaluation of word embedding models and deep learning algorithms for Turkish text classification.
  • Kirelli, Y., & Arslankaya, S. (2020). Sentiment analysis of shared tweets on global warming on twitter with data mining methods: a case study on Turkish language. Computational Intelligence and Neuroscience, 2020(1), 1904172. Koksal, A. T., Bozal, O., Yürekli, E., & Gezici, G. (2020). TurkishTweets: A benchmark dataset for Turkish text correction. In Findings of the Association for Computational Linguistics: EMNLP 2020.
  • Kontuk, R., & Turan, M. (2020). NLP kullanılarak haberlerin yaş gruplarına göre sınıflandırılması. Gazi University Journal of Science Part C: Design and Technology, 8(2), 372-382.
  • Koru, G. K., & Uluyol, Ç. (2024). Detection of Turkish fake news from tweets with bert models. IEEE Access.
  • Köksal, A., & Özgür, A. (2021). Twitter dataset and evaluation of transformers for Turkish sentiment analysis. 29th Signal Processing and Communications Applications Conference (SIU). IEEE, 2021. p. 1-4.
  • Köksal, Ö., & Yılmaz, E. H. (2022). Improving automated Turkish text classification with learning‐based algorithms. Concurrency and Computation: Practice and Experience, 34(11), e6874.
  • Kuruca, Y., Üstüner, M., & Şimşek, I. (2022). Dijital pazarlamada yapay zekâ kullanımı: Sohbet robotu (Chatbot). Medya ve Kültür, 2(1), 88-113.
  • Kuyumcu, B., Aksakalli, C., & Delil, S. (2019). An automated new approach in fast text classification (fastText) A case study for Turkish text classification without pre-processing. Proceedings of the 2019 3rd international conference on natural language processing and information retrieval. 2019. p. 1-4.
  • Küçük, D., & Can, F. (2019). A tweet dataset annotated for named entity recognition and stance detection. arXiv preprint arXiv:1901.04787.
  • Li, G., Wang, Z., Zhao, M., Song, Y., & Lan, L. (2022). Sentiment analysis of political posts on Hong Kong local forums using fine-tuned mBERT. 2022 IEEE International Conference on Big Data (Big Data), 2022 IEEE International Conference on Big Data (Big Data). IEEE, 2022. p. 6763-6765.
  • Marsan, B., Kara, N., Ozçelik, M., Arıcan, B. N., Cesur, N., Kuzgun, A., Sanıyar, E., Kuyrukçu, O., & Yıldız, O. T. (2021). Building the turkish framenet. South African Centre for Digital Language Resources (SADiLaR) Potchefstroom, South Africa, 118.
  • Morwal, S., Jahan, N., & Chopra, D. (2012). Named entity recognition using hidden Markov model (HMM). International Journal on Natural Language Computing (IJNLC) Vol, 1.
  • Muller, B., Gupta, D., Fauconnier, J.-P., Patwardhan, S., Vandyke, D., & Agarwal, S. (2023). Languages you know influence those you learn: Impact of language characteristics on multi-lingual text-to-text transfer. Transfer Learning for Natural Language Processing Workshop. PMLR, 2023. p. 88-102.
  • Mumcuoğlu, E., Öztürk, C. E., Ozaktas, H. M., & Koç, A. (2021). Natural language processing in law: Prediction of outcomes in the higher courts of Turkey. Information Processing & Management, 58(5), 102684.
  • Najafi, A., & Varol, O. (2024). Turkishbertweet: Fast and reliable large language model for social media analysis. Expert Systems with Applications, 255, 124737.
  • Nangia, N., & Bowman, S. R. (2019). Human vs. muppet: A conservative estimate of human performance on the GLUE benchmark. arXiv preprint arXiv:1905.10425.
  • Nasution, A. H., & Onan, A. (2024). ChatGPT label: Comparing the quality of human-generated and LLM-generated annotations in low-resource language NLP tasks. IEEE Access.
  • Nazaretsky, T., Yolcu, H. H., Ariely, M., & Alexandron, G. (2023). Towards automated assessment of scientific explanations in Turkish using language transfer. Proceedings of the 16th International Conference on Educational Data Mining. 2023. p. 453-457.
  • Nezhad, S. B., & Agrawal, A. (2023). mBBC: Exploring the multilingual maze. arXiv preprint arXiv:2310.05404.
  • Okur, H. I., & Sertbaş, A. (2021). Pretrained neural models for turkish text classification. 2021 6th International Conference on Computer Science and Engineering (UBMK). IEEE, 2021. p. 174-179.
  • Onan, A., & Balbal, K. F. (2024). Improving Turkish text sentiment classification through task-specific and universal transformations: an ensemble data augmentation approach. IEEE Access.
  • Ozcelik, O., & Toraman, C. (2022). Named entity recognition in Turkish: A comparative study with detailed error analysis. Information Processing & Management, 59(6), 103065.
  • Ozdemir, A., & Yeniterzi, R. (2020). Su-nlp at semeval-2020 task 12: Offensive language identification in turkish tweets. Proceedings of the Fourteenth Workshop on Semantic Evaluation. 2020. p. 2171-2176.
  • Özateş, Ş. B., Tıraş, T. E., Genç, E. E., & Taşdemir, E. F. B. (2024). Dependency annotation of Ottoman Turkish with multilingual BERT. arXiv preprint arXiv:2402.14743.
  • Özçift, A., Akarsu, K., Yumuk, F., & Söylemez, C. (2021). Advancing natural language processing (NLP) applications of morphologically rich languages with bidirectional encoder representations from transformers (BERT): an empirical case study for Turkish. Automatika, 62(2), 226-238. https://doi.org/10.1080/00051144.2021.1922150
  • Özkan, M., & Kar, G. (2022). Türkçe dilinde yazilan bilimsel metinlerin derin öğrenme tekniği uygulanarak çoklu siniflandirilmasi. Mühendislik Bilimleri ve Tasarım Dergisi, 10(2), 504-519.
  • Öztürk, C. E., Özçelik, S. B., & Koç, A. (2023). A Transformer-based prior legal case retrieval method. 2023 31st Signal Processing and Communications ApplicationsCon.,Siu https://doi.org/10.1109/Siu59756.2023.10223938
  • Panchendrarajan, R., & Amaresan, A. (2018, 2018). Bidirectional LSTM-CRF for named entity recognition.
  • Rajpurkar, P., Jia, R., & Liang, P. (2018). Know what you don't know: Unanswerable questions for SQuAD. arXiv preprint arXiv:1806.03822.
  • Ryu, M., & Nakajima, K. (2022). Analysis and mitigation of dataset artifacts in OpenAI GPT-3. In.
  • Safaya, A., Kurtuluş, E., Göktoğan, A., & Yuret, D. (2022). Mukayese: Turkish NLP strikes back. arXiv preprint arXiv:2203.01215.
  • Sanh, V. (2019). DistilBERT, A Distilled Version of BERT: Smaller, faster, cheaper and lighter. arXiv preprint arXiv:1910.01108.
  • Sarıtaş, K., Öz, C. A., & Güngör, T. (2024). A comprehensive analysis of static word embeddings for Turkish. Expert Systems with Applications, 252, 124123.
  • Schweter, S. (2020). Berturk-bert models for Turkish, April 2020. URL https://doi. org/10.5281/zenodo, 3770924.
  • Sert, M. F., Yıldırım, E., & Haşlak, İ. (2022). Using artificial intelligence to predict decisions of the Turkish constitutional court. Social Science Computer Review, 40(6), 1416-1435.
  • Song, B., Li, Z., Lin, X., Wang, J., Wang, T., & Fu, X. (2021). Pretraining model for biological sequence data. Briefings in functional genomics, 20(3), 181-195.
  • Soygazi, F., Çiftçi, O., Kök, U., & Cengiz, S. (2021). THQuAD: Turkish historic question answering dataset for reading comprehension. 2021 6th international conference on computer science and engineering (UBMK). IEEE, 2021. p. 215-220.
  • Srinivasan, A., Sitaram, S., Ganu, T., Dandapat, S., Bali, K., & Choudhury, M. (2021). Predicting the performance of multilingual nlp models. arXiv preprint arXiv:2110.08875.
  • Suncak, A., & Aktaş, Ö. (2021). A novel approach for detecting defective expressions in Turkish. Journal of Artificial Intelligence and Data Science, 1(1), 35-40.
  • Suncak, A., & Aktaş, Ö. (2022). Detecting Defective Expressions in Turkish Sentences Using a Hybrid Deep Learning Method. Dokuz Eylül Üniversitesi Mühendislik Fakültesi Fen ve Mühendislik Dergisi, 24(72), 825-834.
  • Şahin, G. G. (2024). Bridging the Gap Between Wikipedians and Scientists with Terminology-Aware Translation: A Case Study in Turkish. Wikimedia Research Fund 2024
  • Şapcı, A. O. B., Taştan, Ö., & Yeniterzi, R. (2020). Active Learning for Turkish Text Classification. 28th Signal Processing and Communications Applications Conference (SIU). IEEE, 2020. p. 1-4.
  • Tohma, K., & Kutlu, Y. (2020). Challenges Encountered in Turkish Natural Language Processing Studies. Natural and Engineering Sciences, 5(3), 204-211.
  • Tohma, K., Okur, H. I., Kutlu, Y., & Sertbas, A. (2023). Sentiment Analysis in Turkish Question Answering Systems: An Application of Human-Robot Interaction. IEEE Access.
  • Tokcaer, S. (2021). Türkçe metinlerde duygu analizi. Yaşar Üniversitesi E-Dergisi, 16(63), 1514-1534.
  • Toraman, C., Ozcelik, O., Sahinuç, F., & Sahin, U. (2022, 2022). ARC-NLP at CheckThat!-2022: Contradiction for Harmful Tweet Detection. CLEF (Working Notes). 2022. p. 722-739.
  • Touheed, M., Zubair, U., Sabir, D., Hassan, A., Butt, M. F. U., Riaz, F., Abdul, W., & Ayub, R. (2024). Applications of Pruning Methods in Natural Language Processing. IEEE Access.
  • Tulu, C. N. (2022). Experimental comparison of pre-trained word embedding vectors of Word2Vec, glove, FastText for word level semantic text similarity measurement in turkish. Advances in Science and Technology. Research Journal, 16(4).
  • Tunali, V. (2022). Improved prioritization of software development demands in Turkish with deep learning-based NLP. IEEE Access, 10, 40249-40263.
  • Turker, M., Ari, M. E., & Han, A. (2024). VNLP: Turkish NLP Package. arXiv preprint arXiv:2403.01309.
  • Türk, U., Atmaca, F., Özates, S. B., Berk, G., Bedir, S. T., Köksal, A., Basaran, B. Ö., Güngör, T., & Özgür, A. (2022). Resources for Turkish dependency parsing: introducing the BOUN Treebank and the BoAT annotation tool. Language Resources and Evaluation, 56(1), 259-307. https://doi.org/10.1007/s10579-021-09558-0
  • Türkmen, H., Dikenelli, O., Eraslan, C., Callı, M. C., & Özbek, S. S. (2023). BioBERTurk: Exploring Turkish Biomedical Language Model Development Strategies in Low-Resource Setting. Journal of Healthcare Informatics Research, 7(4), 433-446.
  • Türkmen, H., Dikenelli, O., Eraslan, C., Çalli, M. C., & Ozbek, S. S. (2022). Developing Pretrained Language Models for Turkish Biomedical Domain. 2022 IEEE 10th International Conference on Healthcare Informatics (ICHI), IEEE, 2022. 597-598.
  • Türkmen, H., Dikenelli, O., Eraslan, C., Çallı, M. C., & Özbek, S. S. (2023). Harnessing the power of BERT in the Turkish clinical domain: pretraining approaches for limited data scenarios. arXiv preprint arXiv:2305.03788.
  • Uludoğan, G., Balal, Z. Y., Akkurt, F., Türker, M., Güngör, O., & Üsküdarlı, S. (2024). Turna: A turkish encoder-decoder language model for enhanced understanding and generation. arXiv preprint arXiv:2401.14373.
  • Uskudarli, S., Şen, M., Akkurt, F., Gürbüz, M., Güngör, O., Özgür, A., & Güngör, T. (2023). TULAP-An Accessible and Sustainable Platform for Turkish Natural Language Processing Resources. Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: System Demonstrations. 2023. p. 219-227.
  • Uymaz, H. A., & Metin, S. K. (2023a). Emotion-enriched word embeddings for Turkish. Expert Systems with Applications, 225, 120011.
  • Uymaz, H. A., & Metin, S. K. (2023b). Enriching Transformer-Based Embeddings for Emotion Identification in an Agglutinative Language: Turkish. It Professional, 25(4), 67-73.
  • Wikipedia. (2024). Türk dilleri. Wikipedia. Retrieved 18.10.2024 from https://tr.wikipedia.org/wiki/T%C3%BCrk_dilleri
  • Xu, Q. A., Chang, V., & Jayne, C. (2022). A systematic review of social media-based sentiment analysis: Emerging trends and challenges. Decision Analytics Journal, 3, 100073.
  • Xue, L. (2020). mt5: A massively multilingual pre-trained text-to-text transformer. arXiv preprint arXiv:2010.11934.
  • Yao, S., Peng, B., Papadimitriou, C., & Narasimhan, K. (2021). Self-attention networks can process bounded hierarchical languages. arXiv preprint arXiv:2105.11115.
  • Yazar, B. K., Şahın, D. Ö., & Kiliç, E. (2023). Low-resource neural machine translation: A systematic literature review. IEEE Access, 11, 131775-131813.
  • Yıldırım, A., Cetiner, M., Öksüz, C., & Onay, B. (2021). A search tool in Turkish using contextual vectors. 2021 29th Signal Processing and Communications Applications Conference (SIU). IEEE, 2021. p. 1-4.
  • Yıldız, O. T., Avar, B., & Ercan, G. (2019). An open, extendible, and fast Turkish morphological analyzer. Proceedings of Recent Advances in Natural Language Processing, pages 1364–1372 Yilmaz, E. H., & Toraman, C. (2021). Intent classification based on deep learning language model in turkish dialog systems. 2021 29th Signal Processing and Communications Applications Conference (SIU) p 1-4
  • Yucalar, F. (2023). Developing an advanced software requirements classification model using bert: An empirical evaluation study on newly generated turkish data. Applied Sciences, 13(20), 11127.
  • Zhuang, F., Qi, Z., Duan, K., Xi, D., Zhu, Y., Zhu, H., Xiong, H., & He, Q. (2020). A comprehensive survey on transfer learning. Proceedings of the IEEE, 109(1), 43-76.
  • Zorarpaci, E. (2023). A Turkish Text Classification Based Feature Selection and Density Peaks Clustering. In 2023 31st Signal Processing and Communications Applications Conference (SIU) (pp. 1-4). IEEE.
  • Zovikoğlu, M., & Çetin, U. (2024). Detecting misinformation on social networks with NLP. Transactions on Computer Science and Applications, 1(1), 11-16.
Toplam 139 adet kaynakça vardır.

Details

Primary Language: Turkish
Subjects: Natural Language Processing
Section: Review
Authors

Zülfü Alanoğlu 0000-0001-9710-5658

Submission Date: 16 April 2025
Acceptance Date: 2 September 2025
Early View Date: 9 December 2025
Publication Date: 18 December 2025
Published in Issue: Year 2025, Volume: 24, Issue: 48

Cite

APA Alanoğlu, Z. (2025). TÜRKÇE DOĞAL DİL İŞLEME TEMELLİ ÇALIŞMALARIN TEORİK DEĞERLENDİRMESİ: YÖNTEMSEL ZORLUKLAR VE GELECEK PERSPEKTİFLERİ. İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi, 24(48), 686-724. https://doi.org/10.55071/ticaretfbd.1677269
AMA Alanoğlu Z. TÜRKÇE DOĞAL DİL İŞLEME TEMELLİ ÇALIŞMALARIN TEORİK DEĞERLENDİRMESİ: YÖNTEMSEL ZORLUKLAR VE GELECEK PERSPEKTİFLERİ. İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi. December 2025;24(48):686-724. doi:10.55071/ticaretfbd.1677269
Chicago Alanoğlu, Zülfü. “TÜRKÇE DOĞAL DİL İŞLEME TEMELLİ ÇALIŞMALARIN TEORİK DEĞERLENDİRMESİ: YÖNTEMSEL ZORLUKLAR VE GELECEK PERSPEKTİFLERİ”. İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi 24, no. 48 (December 2025): 686-724. https://doi.org/10.55071/ticaretfbd.1677269.
EndNote Alanoğlu Z (01 December 2025) TÜRKÇE DOĞAL DİL İŞLEME TEMELLİ ÇALIŞMALARIN TEORİK DEĞERLENDİRMESİ: YÖNTEMSEL ZORLUKLAR VE GELECEK PERSPEKTİFLERİ. İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi 24 48 686–724.
IEEE Z. Alanoğlu, “TÜRKÇE DOĞAL DİL İŞLEME TEMELLİ ÇALIŞMALARIN TEORİK DEĞERLENDİRMESİ: YÖNTEMSEL ZORLUKLAR VE GELECEK PERSPEKTİFLERİ”, İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi, vol. 24, no. 48, pp. 686–724, 2025, doi: 10.55071/ticaretfbd.1677269.
ISNAD Alanoğlu, Zülfü. “TÜRKÇE DOĞAL DİL İŞLEME TEMELLİ ÇALIŞMALARIN TEORİK DEĞERLENDİRMESİ: YÖNTEMSEL ZORLUKLAR VE GELECEK PERSPEKTİFLERİ”. İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi 24/48 (December 2025), 686-724. https://doi.org/10.55071/ticaretfbd.1677269.
JAMA Alanoğlu Z. TÜRKÇE DOĞAL DİL İŞLEME TEMELLİ ÇALIŞMALARIN TEORİK DEĞERLENDİRMESİ: YÖNTEMSEL ZORLUKLAR VE GELECEK PERSPEKTİFLERİ. İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi. 2025;24:686–724.
MLA Alanoğlu, Zülfü. “TÜRKÇE DOĞAL DİL İŞLEME TEMELLİ ÇALIŞMALARIN TEORİK DEĞERLENDİRMESİ: YÖNTEMSEL ZORLUKLAR VE GELECEK PERSPEKTİFLERİ”. İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi, vol. 24, no. 48, 2025, pp. 686-724, doi:10.55071/ticaretfbd.1677269.
Vancouver Alanoğlu Z. TÜRKÇE DOĞAL DİL İŞLEME TEMELLİ ÇALIŞMALARIN TEORİK DEĞERLENDİRMESİ: YÖNTEMSEL ZORLUKLAR VE GELECEK PERSPEKTİFLERİ. İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi. 2025;24(48):686-724.