Research Article

Doğal Dil İşlemede Yerel Dil Zenginliği: Azbert ile Azerbaycan Türkçesi Metinlerinin Anlamsal Analizi

Year 2025, Volume: 1 Issue: 2, 27–57


References

  • Abdullayeva, Z., & Abbasova, M. (2022). Azərbaycanca metinlərdə duyğu analizi üçün transformer-tabanlı yanaşmaların qiymətləndirilməsi. Qafqaz Nəqliyyat və Texnologiyalar Jurnalı, 5(1), 27–38.
  • Agarwal, A., & Mittal, S. (2018). Sentiment analysis: A survey. International Journal of Computer Applications, 179(31), 1–5.
  • Akbarov, M., & Guliyev, A. (2023). Multilingual BERT modeli ilə Azərbaycan türkcəsində duyğu sınıflandırması. Azərbaycan Süni İntellekt və Dillər Araşdırmaları Jurnalı, 4(2), 74–89.
  • Akbik, A., Bergmann, T., & Vollgraf, R. (2019). FLAIR: An easy-to-use framework for state-of-the-art NLP. In Proceedings of NAACL-HLT 2019 (pp. 54–59).
  • Bojanowski, P., Grave, E., Mikolov, T., & Joulin, A. (2017). Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 5, 135–146.
  • Çöltekin, Ç. (2020). A corpus-based study of Turkish language resources for sentiment analysis. Language Resources and Evaluation, 54(4), 1123–1148. https://doi.org/10.1007/s10579-019-09481-5
  • Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of NAACL-HLT 2019 (pp. 4171–4186).
  • Gültekin, A., & Özkan, M. (2019). Comparison of word embedding methods for Turkish text classification. In Proceedings of the International Conference on Natural Language Processing and Information Retrieval (pp. 85–92).
  • Hakkani-Tür, D., & Tür, G. (2017). Text classification using word embeddings for social media analysis. In Proceedings of the 2017 International Conference on Social Computing, Behavioral-Cultural Modeling, and Prediction (pp. 150–159).
  • Howard, J., & Ruder, S. (2018). Universal language model fine-tuning for text classification. In Proceedings of ACL 2018 (pp. 328–339).
  • Huseynov, R., & Karimov, I. (2019). Azerbaycan dilində duyğu analizi üçün yeni yanaşmalar. Azerbaycan Dil Bilimi Dergisi, 25(6), 88–102.
  • Isayev, S. (2021). Azerbaycan dilində duyğu analizi və dərin öyrənmə. Azerbaycan Bilim ve Teknoloji Dergisi, 18(4), 124–138.
  • Khanna, S., & Varma, V. (2020). Sentiment analysis for low resource languages: A case study of Punjabi. In Proceedings of the 2020 International Conference on Natural Language Processing (pp. 90–99).
  • Liu, B. (2012). Sentiment analysis and opinion mining. Synthesis Lectures on Human Language Technologies, 5(1), 1–167. https://doi.org/10.2200/S00416ED1V01Y201204HLT016
  • Mammadov, E., & Aliyev, H. (2020). Azerbaycan dilində metin sınıflandırması: BERT modeli ilə təcrübələr. Azerbaycan İnformatika Jurnalı, 10(3), 55–68.
  • Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv. https://arxiv.org/abs/1301.3781
  • Özyurt, O. (2022). Sentiment classification using hybrid transformer-based models for Turkish language. Turkish Journal of Electrical Engineering & Computer Sciences, 30(2), 1401–1416. https://doi.org/10.55730/1300-0632.3878
  • Quliyev, N. (2019). Azerbaycan dilində duyğu analizi üzrə tədqiqatlar. Bakı: Bakı Dövlət Universiteti Yayınları.
  • Radford, A., Narasimhan, K., Salimans, T., & Sutskever, I. (2018). Improving language understanding by generative pre-training. OpenAI Blog. https://openai.com/research/language-unsupervised
  • Rogers, A., Kovaleva, O., & Rumshisky, A. (2020). A primer in BERTology: What we know about how BERT works. Transactions of the Association for Computational Linguistics, 8, 842–866. https://doi.org/10.1162/tacl_a_00349
  • Sennrich, R., Haddow, B., & Birch, A. (2016). Neural machine translation of rare words with subword units. In Proceedings of ACL 2016 (pp. 1715–1725).
  • Şahin, C., & Diri, B. (2021). A review on Turkish natural language processing tools and resources. Turkish Journal of Computer and Mathematics Education, 12(2), 131–149.
  • TensorFlow. (2025). TensorFlow. https://www.tensorflow.org
  • Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is all you need. In Proceedings of NeurIPS 2017 (pp. 5998–6008).
  • Yılmaz, K., & Çetin, M. (2020). Sentiment analysis for Azerbaijani text using BERT model. In Proceedings of the 2020 International Conference on Artificial Intelligence and Data Engineering (pp. 65–72).
  • Zhang, L., & Wallace, B. C. (2015). A sensitivity analysis of (and practitioner’s guide to) convolutional neural networks for sentence classification. In Proceedings of EMNLP 2015 (pp. 1431–1440).

Natural Language Processing in Azerbaijani: Text Analysis and Artificial Intelligence Applications

Year 2025, Volume: 1 Issue: 2, 27–57

Abstract

This study presents a BERT-based model designed for text classification and sentiment analysis in Azerbaijani Turkish, with the goal of classifying texts as positive, negative, or neutral. The model, built on the transformer architecture, is trained by fine-tuning a pre-trained BERT model, taking into account the language's morphological richness and local expressions. Its effectiveness was assessed with standard performance metrics: accuracy, precision, recall, and F1 score. The evaluation showed that the model achieved 96% accuracy and a 95% F1 score. Accuracy was particularly high for the positive and negative classes, although some misclassifications occurred due to local-language usage and structural elements. The findings make a meaningful contribution to sentiment analysis by showing that morphological and semantic features of the language are essential considerations. Further research with larger datasets and models tailored to the local language is likely to improve performance further.
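The reported scores (96% accuracy, 95% F1) are standard classification metrics computed from the model's predictions on a held-out set. As the evaluation data itself is not part of this page, the sketch below uses small hypothetical label lists purely to illustrate how accuracy, per-class precision/recall/F1, and the macro-averaged F1 are derived for a three-class sentiment task:

```python
def classification_metrics(y_true, y_pred, labels):
    """Accuracy, per-class precision/recall/F1, and macro-F1
    from gold labels and predicted labels."""
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    per_class = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[c] = {"precision": precision, "recall": recall, "f1": f1}
    macro_f1 = sum(m["f1"] for m in per_class.values()) / len(labels)
    return accuracy, per_class, macro_f1

# Hypothetical gold labels and model predictions (not the paper's data):
gold = ["pos", "neg", "neu", "pos", "neg", "pos", "neu", "neg"]
pred = ["pos", "neg", "neu", "pos", "pos", "pos", "neu", "neg"]
acc, per_class, macro_f1 = classification_metrics(gold, pred, ["pos", "neg", "neu"])
print(acc)  # 0.875 (7 of 8 correct)
```

Macro-averaging weights each class equally, which is why a high overall accuracy can coexist with a slightly lower F1 when one class (here, the neutral or locally phrased examples the abstract mentions) is harder to classify.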


Details

Primary Language Turkish
Subjects Deep Learning, Natural Language Processing, Artificial Intelligence (Other)
Journal Section Research Article
Authors

Orkhan Valiev 0009-0003-5475-262X

Gülsüm Kayabaşı Koru 0000-0002-1749-900X

Early Pub Date August 9, 2025
Publication Date November 17, 2025
Submission Date June 24, 2025
Acceptance Date August 8, 2025
Published in Issue Year 2025 Volume: 1 Issue: 2

Cite

APA Valiev, O., & Kayabaşı Koru, G. (2025). Doğal Dil İşlemede Yerel Dil Zenginliği: Azbert ile Azerbaycan Türkçesi Metinlerinin Anlamsal Analizi. ULUSLARARASI BİLİŞİM SİSTEMLERİ VE UYGULAMALARI DERGİSİ, 1(2), 27-57.

Fee Policy
No fees are charged to authors or their institutions under any circumstances.