Research Article

Doğal Dil İşlemede Yerel Dil Zenginliği: Azbert ile Azerbaycan Türkçesi Metinlerinin Anlamsal Analizi

Year 2025, Volume: 1 Issue: 2, 27–57


References

  • Abdullayeva, Z., & Abbasova, M. (2022). Azərbaycanca metinlərdə duyğu analizi üçün transformer-tabanlı yanaşmaların qiymətləndirilməsi. Qafqaz Nəqliyyat və Texnologiyalar Jurnalı, 5(1), 27–38.
  • Agarwal, A., & Mittal, S. (2018). Sentiment analysis: A survey. International Journal of Computer Applications, 179(31), 1–5.
  • Akbarov, M., & Guliyev, A. (2023). Multilingual BERT modeli ilə Azərbaycan türkcəsində duyğu sınıflandırması. Azərbaycan Süni İntellekt və Dillər Araşdırmaları Jurnalı, 4(2), 74–89.
  • Akbik, A., Bergmann, T., & Vollgraf, R. (2019). FLAIR: An easy-to-use framework for state-of-the-art NLP. In Proceedings of NAACL-HLT 2019 (pp. 54–59).
  • Bojanowski, P., Grave, E., Mikolov, T., & Joulin, A. (2017). Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5, 135–146.
  • Çöltekin, Ç. (2020). A corpus-based study of Turkish language resources for sentiment analysis. Language Resources and Evaluation, 54(4), 1123–1148. https://doi.org/10.1007/s10579-019-09481-5
  • Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT 2019 (pp. 4171–4186).
  • Gültekin, A., & Özkan, M. (2019). Comparison of word embedding methods for Turkish text classification. In Proceedings of the International Conference on Natural Language Processing and Information Retrieval (pp. 85–92).
  • Hakkani-Tür, D., & Tür, G. (2017). Text classification using word embeddings for social media analysis. In Proceedings of the 2017 International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction (pp. 150–159).
  • Howard, J., & Ruder, S. (2018). Universal language model fine-tuning for text classification. In Proceedings of ACL 2018 (pp. 328–339).
  • Huseynov, R., & Karimov, I. (2019). Azerbaycan dilində duyğu analizi üçün yeni yanaşmalar. Azerbaycan Dil Bilimi Dergisi, 25(6), 88–102.
  • Isayev, S. (2021). Azerbaycan dilində duyğu analizi və dərin öyrənmə. Azerbaycan Bilim ve Teknoloji Dergisi, 18(4), 124–138.
  • Khanna, S., & Varma, V. (2020). Sentiment analysis for low resource languages: A case study of Punjabi. In Proceedings of the 2020 International Conference on Natural Language Processing (pp. 90–99).
  • Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1), 1–167. https://doi.org/10.2200/S00416ED1V01Y201204HLT016
  • Mammadov, E., & Aliyev, H. (2020). Azerbaycan dilində metin sınıflandırması: BERT modeli ilə təcrübələr. Azerbaycan İnformatika Jurnalı, 10(3), 55–68.
  • Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv. https://arxiv.org/abs/1301.3781
  • Özyurt, O. (2022). Sentiment classification using hybrid transformer-based models for Turkish language. Turkish Journal of Electrical Engineering & Computer Sciences, 30(2), 1401–1416. https://doi.org/10.55730/1300-0632.3878
  • Quliyev, N. (2019). Azerbaycan dilində duyğu analizi üzrə tədqiqatlar. Bakı: Bakı Dövlət Universiteti Yayınları.
  • Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training. OpenAI Blog. https://openai.com/research/language-unsupervised
  • Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 8, 842–866. https://doi.org/10.1162/tacl_a_00349
  • Sennrich, R., Haddow, B., & Birch, A. (2016). Neural machine translation of rare words with subword units. In Proceedings of ACL 2016 (pp. 1715–1725).
  • Şahin, C., & Diri, B. (2021). A review on Turkish natural language processing tools and resources. Turkish Journal of Computer and Mathematics Education, 12(2), 131–149.
  • TensorFlow. (2025). TensorFlow. https://www.tensorflow.org
  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In Proceedings of NeurIPS 2017 (pp. 5998–6008).
  • Yılmaz, K., & Çetin, M. (2020). Sentiment analysis for Azerbaijani text using BERT model. In Proceedings of the 2020 International Conference on Artificial Intelligence and Data Engineering (pp. 65–72).
  • Zhang, L., & Wallace, B. C. (2015). A sensitivity analysis of (and practitioner’s guide to) convolutional neural networks for sentence classification. In Proceedings of EMNLP 2015 (pp. 1431–1440).

Natural Language Processing in Azerbaijani: Text Analysis and Artificial Intelligence Applications

Year 2025, Volume: 1 Issue: 2, 27–57

Abstract

This study presents a BERT-based model designed for text classification and sentiment analysis in Azerbaijani Turkish, with the goal of classifying texts as positive, negative, or neutral. The model, built on the transformer architecture, is trained by fine-tuning a pre-trained BERT model, taking into account the language's morphological richness and local expressions. Its effectiveness was assessed with standard performance metrics: accuracy, precision, recall, and F1 score. The evaluation showed that the model achieved 96% accuracy and a 95% F1 score. Accuracy was particularly high for the positive and negative classes, although some misclassifications occurred due to local-language usage and structural elements. The findings make a meaningful contribution to sentiment analysis by showing that morphological and semantic features of the language are essential considerations. Further research with larger datasets and models tailored to the local language is likely to improve performance further.
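The reported scores (96% accuracy, 95% F1) are standard classification metrics computed from the model's predictions on a held-out set. As the evaluation data itself is not part of this page, the sketch below uses small hypothetical label lists purely to illustrate how accuracy, per-class precision/recall/F1, and the macro-averaged F1 are derived for a three-class sentiment task:

```python
def classification_metrics(y_true, y_pred, labels):
    """Accuracy, per-class precision/recall/F1, and macro-F1
    from gold labels and predicted labels."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    per_class = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[c] = {"precision": precision, "recall": recall, "f1": f1}
    macro_f1 = sum(m["f1"] for m in per_class.values()) / len(labels)
    return accuracy, per_class, macro_f1

# Hypothetical gold labels and model predictions (not the paper's data):
gold = ["pos", "neg", "neu", "pos", "neg", "pos", "neu", "neg"]
pred = ["pos", "neg", "neu", "pos", "pos", "pos", "neu", "neg"]
acc, per_class, macro_f1 = classification_metrics(gold, pred, ["pos", "neg", "neu"])
print(acc)  # 0.875 (7 of 8 correct)
```

Macro-averaging weights each class equally, which is why a high overall accuracy can coexist with a slightly lower F1 when one class (here, the neutral or locally phrased examples the abstract mentions) is harder to classify.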


Details

Primary Language Turkish
Subjects Deep Learning, Natural Language Processing, Artificial Intelligence (Other)
Journal Section Research Article
Authors

Orkhan Valiev 0009-0003-5475-262X

Gülsüm Kayabaşı Koru 0000-0002-1749-900X

Early Pub Date August 9, 2025
Publication Date November 17, 2025
Submission Date June 24, 2025
Acceptance Date August 8, 2025
Published in Issue Year 2025 Volume: 1 Issue: 2

Cite

APA Valiev, O., & Kayabaşı Koru, G. (2025). Doğal Dil İşlemede Yerel Dil Zenginliği: Azbert ile Azerbaycan Türkçesi Metinlerinin Anlamsal Analizi. ULUSLARARASI BİLİŞİM SİSTEMLERİ VE UYGULAMALARI DERGİSİ, 1(2), 27-57.

Fee Policy
No fees are charged to authors or their institutions under any circumstances.