Research Article

Multi-Class News Classification with BERT, DistilBERT, RoBERTa, and ELECTRA Natural Language Processing Models

Volume: 14 Number: 1 January 21, 2026

Abstract

The rapid proliferation of digital news sources necessitates the effective analysis and classification of large-scale textual data. In this study, BERT (Bidirectional Encoder Representations from Transformers) and three of its derivatives (DistilBERT, RoBERTa, and ELECTRA) were comparatively evaluated on the automatic classification of multi-class news texts. Each model performed the classification task by learning the contextual and semantic features of texts belonging to different news categories. Model performance was analyzed using metrics including accuracy, precision, recall, and F1 score. DistilBERT performed best, achieving an accuracy of 0.92 and a mean F1 score of 0.92. The findings show that transformer-based models perform strongly in news classification and illustrate how architectural differences among these models affect classification success. The results thus offer insight into the practical effectiveness of different language model architectures.
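For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how accuracy and class-averaged precision, recall, and F1 can be computed for a multi-class problem. It assumes macro averaging (an unweighted mean over classes); the abstract does not state which averaging the "mean F1 score" uses, and the function name `macro_metrics`, the category names, and the toy labels are hypothetical illustrations rather than the study's code or data.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Assumption: macro averaging (unweighted mean over classes).
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()

    # Count true positives, false positives, and false negatives per class.
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        p = tp[c] / (tp[c] + fp[c]) if (tp[c] + fp[c]) else 0.0
        r = tp[c] / (tp[c] + fn[c]) if (tp[c] + fn[c]) else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical toy labels, not data from the study.
y_true = ["sports", "sports", "politics", "politics", "tech", "tech"]
y_pred = ["sports", "politics", "politics", "politics", "tech", "sports"]
scores = macro_metrics(y_true, y_pred)
print({name: round(value, 3) for name, value in scores.items()})
```

Macro averaging weights every news category equally regardless of its size. A support-weighted average would be the alternative if the category distribution were imbalanced and larger classes were meant to count for more.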

Keywords

Supporting Institution

This research received no external funding.

Ethical Statement

This study does not involve human or animal participants. All procedures followed scientific and ethical principles, and all referenced studies are appropriately cited.

Acknowledgments

The authors do not wish to acknowledge any individual or institution.

Details

Primary Language

English

Subjects

Deep Learning, Classification Algorithms

Journal Section

Research Article

Publication Date

January 21, 2026

Submission Date

July 7, 2025

Acceptance Date

November 10, 2025

Published in Issue

Year 2026 Volume: 14 Number: 1

APA
Şentürk, A., Albayrak, A., & Arpacı, S. (2026). Multi-Class News Classification with BERT, DistilBERT, RoBERTa, and ELECTRA Natural Language Processing Models. Duzce University Journal of Science and Technology, 14(1), 117-129. https://doi.org/10.29130/dubited.1737003
AMA
1. Şentürk A, Albayrak A, Arpacı S. Multi-Class News Classification with BERT, DistilBERT, RoBERTa, and ELECTRA Natural Language Processing Models. DUBİTED. 2026;14(1):117-129. doi:10.29130/dubited.1737003
Chicago
Şentürk, Arafat, Ahmet Albayrak, and Serdar Arpacı. 2026. “Multi-Class News Classification With BERT, DistilBERT, RoBERTa, and ELECTRA Natural Language Processing Models”. Duzce University Journal of Science and Technology 14 (1): 117-29. https://doi.org/10.29130/dubited.1737003.
EndNote
Şentürk A, Albayrak A, Arpacı S (January 1, 2026) Multi-Class News Classification with BERT, DistilBERT, RoBERTa, and ELECTRA Natural Language Processing Models. Duzce University Journal of Science and Technology 14 1 117–129.
IEEE
[1] A. Şentürk, A. Albayrak, and S. Arpacı, “Multi-Class News Classification with BERT, DistilBERT, RoBERTa, and ELECTRA Natural Language Processing Models”, DUBİTED, vol. 14, no. 1, pp. 117–129, Jan. 2026, doi: 10.29130/dubited.1737003.
ISNAD
Şentürk, Arafat - Albayrak, Ahmet - Arpacı, Serdar. “Multi-Class News Classification With BERT, DistilBERT, RoBERTa, and ELECTRA Natural Language Processing Models”. Duzce University Journal of Science and Technology 14/1 (January 1, 2026): 117-129. https://doi.org/10.29130/dubited.1737003.
JAMA
1. Şentürk A, Albayrak A, Arpacı S. Multi-Class News Classification with BERT, DistilBERT, RoBERTa, and ELECTRA Natural Language Processing Models. DUBİTED. 2026;14:117–129.
MLA
Şentürk, Arafat, et al. “Multi-Class News Classification With BERT, DistilBERT, RoBERTa, and ELECTRA Natural Language Processing Models”. Duzce University Journal of Science and Technology, vol. 14, no. 1, Jan. 2026, pp. 117-29, doi:10.29130/dubited.1737003.
Vancouver
1. Arafat Şentürk, Ahmet Albayrak, Serdar Arpacı. Multi-Class News Classification with BERT, DistilBERT, RoBERTa, and ELECTRA Natural Language Processing Models. DUBİTED. 2026 Jan. 1;14(1):117-29. doi:10.29130/dubited.1737003