Review Article

Türkçe Haber Metinlerinin Derin Öğrenme Modelleriyle Sınıflandırılması: Karşılaştırmalı Bir İnceleme

Year 2025, Volume: 18 Issue: 2, 124 - 132, 22.12.2025
https://doi.org/10.54525/bbmd.1708770

Abstract

Natural language processing, a branch of artificial intelligence, makes it possible to generate, classify, and process text data. Text classification, one of the areas of natural language processing, has an important place in extracting meaningful information from texts. In this study, Turkish news texts were classified using deep learning models that have shown successful classification performance for different languages in the literature, and the results were compared. The dataset consists of approximately 25,000 Turkish news texts. The texts in the dataset were classified into predetermined news categories using different deep learning architectures and transformer models. The results show that the BERT architecture achieved the highest performance with an accuracy of 92.40%, that the H-LSTM-ATT model, developed by adding an attention mechanism to a hierarchical LSTM, achieved a notable accuracy of 91.52%, and that the lowest accuracy was obtained with the deep 2D CNN model (89.35%). The study concludes that transformer models and hybrid deep learning architectures with an added attention mechanism are more successful than the other models in Turkish text classification tasks.
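As a rough illustration of the attention mechanism mentioned in the abstract, the sketch below pools a sequence of hidden states into one vector by weighting each state with a softmax over its similarity to a context vector. It is a simplified dot-product variant written for this page, not the authors' H-LSTM-ATT implementation; the names `attention_pool`, `hidden_states`, and `context`, and the toy values, are assumptions.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, context):
    """Weight each hidden state by its dot product with a context vector,
    then return the weighted sum together with the attention weights."""
    scores = [sum(h_i * c_i for h_i, c_i in zip(h, context)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return pooled, weights

# Toy hidden states for a three-token sequence (hypothetical values).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = [1.0, 1.0]  # in a trained model this vector would be learned
pooled, weights = attention_pool(hidden_states, context)
```

In a hierarchical model, the same pooling step would typically be applied once over the words of each sentence and once over the resulting sentence vectors.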

References

  • Sebastiani, F. (2002). “Machine learning in automated text categorization”. ACM computing surveys (CSUR), 34(1), 1-47.
  • Ho C.C., Baharim K.N., Fatan A.A.A. and Alias M.S.B. (2017). “Deep neural networks for text: A review”. In The 6th International Conference on Computer Science and Computational Mathematics. Langkawi, Malaysia.
  • Alparslan, G., Dursun, M. (2023). Konvolüsyonel Sinir Ağları Tabanlı Türkçe Metin Sınıflandırma. Bilişim Teknolojileri Dergisi, 16(1), 21-31.
  • Yıldırım, S. and Yıldız, T. (2018). “A comparative analysis of text classification for Turkish language”. Pamukkale University Journal of Engineering Sciences, 24(5), 879-886.
  • Acı, Ç. and Çırak, A. (2019). “Türkçe haber metinlerinin konvolüsyonel sinir ağları ve Word2Vec kullanılarak sınıflandırılması”. Bilişim Teknolojileri Dergisi, 12(3), 219-228.
  • Kilimci, Z.H. and Akyokuş, S. (2019, September). “The evaluation of word embedding models and deep learning algorithms for Turkish text classification”. In 2019 4th International Conference on Computer Science and Engineering (UBMK) (pp. 548-553). IEEE.
  • Nergiz, G., Safalı, Y., Avaroğlu, E. and Erdoğan, S. (2019, September). “Classification of Turkish news content by deep learning based LSTM using Fasttext model”. In 2019 International Artificial Intelligence and Data Processing Symposium (IDAP) (pp. 1-6). IEEE.
  • Köksal, Ö. and Akgül, Ö. (2022, March). “A comparative text classification study with deep learning-based algorithms”. In 2022 9th International Conference on Electrical and Electronics Engineering (ICEEE) (pp. 387-391). IEEE.
  • Arzu, M., Aydoğan, M. (2025). Comparison of Transformer-Based Turkish Models for Question-Answering Task. Balkan Journal of Electrical and Computer Engineering, 12(4), 387-393.
  • M. İncidelen and M. Aydoğan, ‘Developing Question-Answering Models in Low-Resource Languages: A Case Study on Turkish Medical Texts Using Transformer-Based Approaches’, in 2024 8th International Artificial Intelligence and Data Processing Symposium (IDAP), Sep. 2024, pp. 1–4. doi: 10.1109/IDAP64064.2024.10711128.
  • Şahin, G., & Diri, B. (2021, June). The effect of transfer learning on Turkish text classification. In 2021 29th Signal Processing and Communications Applications Conference (SIU) (pp. 1-4). IEEE.
  • Yüksel, A. E., Türkmen, Y. A., Özgür, A., & Altınel, B. (2019, September). Turkish tweet classification with transformer encoder. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019) (pp. 1380-1387).
  • Yıldırım, O. and Atık, F. (2013). Kişisel Gazete, Graduation Project, Yıldız Technical University, Istanbul.
  • Jang, B., Kim, M., Harerimana, G., Kang, S. and Kim, J.W. (2020). “Bi-LSTM model to increase accuracy in text classification: Combining Word2vec CNN and attention mechanism”. Applied Sciences, 10(17), 5841.
  • Harish B. S., Guru, D.S. and Shantharamu, M. (2010). “Representation and classification of text documents: A brief review”. IJCA, Special Issue on RTIPPR (2), 110-119.
  • Liu, Z., Lin, Y., & Sun, M. (2020). Representation learning and NLP. Representation Learning for Natural Language Processing, 1-11.
  • Pennington, J., Socher, R., & Manning, C. D. (2014, October). Glove: Global vectors for word representation. In Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP) (pp. 1532-1543).
  • Charniak, E. (2018). Introduction to Deep Learning. The MIT Press. England.
  • Hochreiter, S. and Schmidhuber, J. (1997). “Long short-term memory”. Neural computation, 9(8), 1735-1780.
  • Liu, Y., Ma, J., Tao, Y., Shi, L., Wei, L., & Li, L. (2020, December). Hybrid neural network text classification combining tcn and gru. In 2020 IEEE 23rd international conference on computational science and engineering (CSE) (pp. 30-35). IEEE.
  • Yang, Z., Yang, D., Dyer, C., He, X., Smola, A., & Hovy, E. (2016, June). Hierarchical attention networks for document classification. In Proceedings of the 2016 conference of the North American chapter of the association for computational linguistics: human language technologies (pp. 1480-1489).
  • Goodfellow, I., Bengio, Y. and Courville, A. (2016). Deep learning. MIT press.
  • Zhou, P., Qi, Z., Zheng, S., Xu, J., Bao, H., & Xu, B. (2016). Text classification improved by integrating bidirectional LSTM with two-dimensional max pooling. arXiv preprint arXiv:1611.06639.
  • Conneau, A., Schwenk, H., Barrault, L., & Lecun, Y. (2016). Very deep convolutional networks for text classification. arXiv preprint arXiv:1606.01781.
  • Johnson, R., & Zhang, T. (2017, July). Deep pyramid convolutional neural networks for text categorization. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 562-570).
  • Lai, S., Xu, L., Liu, K., & Zhao, J. (2015, February). Recurrent convolutional neural networks for text classification. In Proceedings of the AAAI conference on artificial intelligence (Vol. 29, No. 1).
  • Emeksiz, C., & Fındık, M. M. (2022). Hybrid Estimation Model (CNN-GRU) Based on Deep Learning for Wind Speed Estimation. International Journal of Multidisciplinary Studies and Innovative Technologies, 6(1), 104-112.
  • Niu Zhaoyang, Zhong Guoqiang and Yu Hui (2021). “A review on the attention mechanism of deep learning”. Neurocomputing, 452, 48-62.
  • Onan, A. (2022). Türkçe Metin Madenciliği için Dikkat Mekanizması Tabanlı Derin Öğrenme Mimarilerinin Değerlendirilmesi. Avrupa Bilim Ve Teknoloji Dergisi, (34), 403-407.
  • Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2019, June). Bert: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (long and short papers) (pp. 4171-4186).
  • Zheng, X., Zhang, C., & Woodland, P. C. (2021, December). Adapting GPT, GPT-2 and BERT language models for speech recognition. In 2021 IEEE Automatic speech recognition and understanding workshop (ASRU) (pp. 162-168). IEEE.
  • Clark, K., Luong, M. T., Le, Q. V., & Manning, C. D. (2020). Electra: Pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555.

Classification of Turkish News Texts with Deep Learning Models: A Comparative Review

Abstract

Natural language processing, a branch of artificial intelligence, enables the generation, classification, and processing of text data. Text classification, one of its subfields, plays an important role in extracting meaningful information from texts. In this study, Turkish news texts were classified using deep learning models that have shown strong classification performance for other languages in the literature, and the results were compared. The dataset consists of approximately 25,000 Turkish news texts. Before classification, the texts were normalized through data preprocessing steps and vectorized using the GloVe word representation method. Text classification was then performed using the designed deep learning models. The BERT architecture achieved the highest performance with an accuracy of 92.40%; the H-LSTM-ATT model, developed by adding an attention mechanism to a hierarchical LSTM, followed with an accuracy of 91.52%; and the deep 2D CNN model had the lowest accuracy (89.35%). The study concludes that transformer architectures and hybrid deep learning models with an added attention mechanism are more successful than the other models in Turkish text classification tasks.
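The normalization and vectorization steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the functions `normalize` and `vectorize`, the two-dimensional `toy_embeddings` table, and the example sentence are all hypothetical, and a real setup would load pretrained GloVe vectors from a file instead. The explicit handling of "İ" and "I" reflects that Python's default `str.lower()` does not follow Turkish casing rules.

```python
import re

def normalize(text):
    """Lowercase with Turkish-aware handling of dotted/dotless I, then tokenize."""
    # Map the Turkish capitals explicitly before the generic lower() call.
    text = text.replace("İ", "i").replace("I", "ı").lower()
    # \w matches Unicode letters and digits, so Turkish characters are kept.
    return re.findall(r"\w+", text)

def vectorize(tokens, embeddings, dim, max_len):
    """Turn tokens into a fixed-length sequence of word vectors.
    Unknown words and padding positions become zero vectors."""
    zero = [0.0] * dim
    vectors = [list(embeddings.get(tok, zero)) for tok in tokens[:max_len]]
    vectors += [[0.0] * dim for _ in range(max_len - len(vectors))]
    return vectors

# Hypothetical 2-dimensional vectors standing in for pretrained GloVe embeddings.
toy_embeddings = {"ekonomi": [0.2, 0.7], "büyüdü": [0.5, 0.1]}

tokens = normalize("Ekonomi, IŞIK hızıyla büyüdü!")
matrix = vectorize(tokens, toy_embeddings, dim=2, max_len=5)
```

The resulting `matrix` has one row per token position and could be fed to a sequence model such as an LSTM or CNN.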

There are 32 citations in total.

Details

Primary Language: Turkish
Subjects: Decision Support and Group Support Systems
Journal Section: Review Article
Authors: Nihal Duman Suna, Oğuz Kaynar
Submission Date: May 29, 2025
Acceptance Date: October 9, 2025
Early Pub Date: December 16, 2025
Publication Date: December 22, 2025
Published in Issue: Year 2025, Volume: 18, Issue: 2

Cite

IEEE N. Duman Suna and O. Kaynar, “Türkçe Haber Metinlerinin Derin Öğrenme Modelleriyle Sınıflandırılması: Karşılaştırmalı Bir İnceleme”, Bilgisayar Bilimleri ve Mühendisliği Dergisi, vol. 18, no. 2, pp. 124–132, 2025, doi: 10.54525/bbmd.1708770.