Research Article

Classification of Spam Content in Turkish Short Messages Using Transformer-Based Models

Volume: 5 Issue: 2, June 27, 2026

Abstract

Short Message Service (SMS) is one of the most widely used communication tools today. This widespread use has also increased the number of unwanted messages, which are sent for fraud, advertising, and promotion and which users do not wish to receive. This study addresses the classification of SMS messages into spam and non-spam categories. To this end, classification models were developed using a specialized SMS dataset. Preprocessing steps were applied to remove meaningless features from the text, and then two classical machine learning algorithms (Naive Bayes and Random Forest) and two deep learning models (BERTurk and Turkish ELECTRA) were used. The dataset was split into an 80% training set and a 20% test set; additionally, 5-fold cross-validation was applied to verify the stability of the results. The models were trained on the preprocessed data and compared using Precision, Recall, F1-Score, and Accuracy. The unprocessed BERTurk model, evaluated using 5-fold cross-validation, achieved the best result with an accuracy of 0.990. The results demonstrate that different algorithms offer distinct advantages depending on the data structure.
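The evaluation protocol named in the abstract (an 80/20 hold-out split, 5-fold cross-validation, and Precision, Recall, F1-Score, and Accuracy) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the random seed, and the convention that label 1 means spam are assumptions made only for illustration.

```python
import random
from typing import Dict, List, Sequence, Tuple


def train_test_split_indices(
    n: int, test_ratio: float = 0.2, seed: int = 42
) -> Tuple[List[int], List[int]]:
    """Shuffle indices 0..n-1 and hold out `test_ratio` of them for testing."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n * test_ratio))
    return indices[n_test:], indices[:n_test]


def k_fold_indices(
    n: int, k: int = 5, seed: int = 42
) -> List[Tuple[List[int], List[int]]]:
    """Return k (train, validation) index pairs; each sample is validated once."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal, disjoint folds
    pairs = []
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        pairs.append((train, folds[i]))
    return pairs


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Compute the four metrics, treating `positive` (assumed: spam) as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Toy labels (hypothetical, not from the paper's dataset): 1 = spam, 0 = non-spam.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
    print(binary_metrics(y_true, y_pred))
```

In practice a stratified split, which preserves the spam to non-spam ratio in every fold, is often preferred for imbalanced SMS data; the abstract does not say whether stratification was used.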

Keywords

Ethics Statement

The manuscript does not require approval from an ethics committee. There is no conflict of interest with any individual or institution regarding the manuscript.

Details

Primary Language

English

Subjects

Computer Software

Section

Research Article

Publication Date

June 27, 2026

Submission Date

February 26, 2026

Acceptance Date

May 3, 2026

Published in Issue

Year 2026 Volume: 5 Issue: 2

Cite

APA
Kıvrak, B., Arzu, M., Kaya, M., Aydoğan, M., & Santur, Y. (2026). Classification of Spam Content in Turkish Short Messages Using Transformer-Based Models. Firat University Journal of Experimental and Computational Engineering, 5(2), 438-453. https://doi.org/10.62520/fujece.1892277
AMA
1. Kıvrak B, Arzu M, Kaya M, Aydoğan M, Santur Y. Classification of Spam Content in Turkish Short Messages Using Transformer-Based Models. Firat University Journal of Experimental and Computational Engineering. 2026;5(2):438-453. doi:10.62520/fujece.1892277
Chicago
Kıvrak, Buğra, Mehmet Arzu, Mahmut Kaya, Murat Aydoğan, and Yunus Santur. 2026. “Classification of Spam Content in Turkish Short Messages Using Transformer-Based Models”. Firat University Journal of Experimental and Computational Engineering 5 (2): 438-53. https://doi.org/10.62520/fujece.1892277.
EndNote
Kıvrak B, Arzu M, Kaya M, Aydoğan M, Santur Y (June 1, 2026) Classification of Spam Content in Turkish Short Messages Using Transformer-Based Models. Firat University Journal of Experimental and Computational Engineering 5 2 438–453.
IEEE
[1] B. Kıvrak, M. Arzu, M. Kaya, M. Aydoğan, and Y. Santur, “Classification of Spam Content in Turkish Short Messages Using Transformer-Based Models”, Firat University Journal of Experimental and Computational Engineering, vol. 5, no. 2, pp. 438–453, Jun. 2026, doi: 10.62520/fujece.1892277.
ISNAD
Kıvrak, Buğra - Arzu, Mehmet - Kaya, Mahmut - Aydoğan, Murat - Santur, Yunus. “Classification of Spam Content in Turkish Short Messages Using Transformer-Based Models”. Firat University Journal of Experimental and Computational Engineering 5/2 (June 1, 2026): 438-453. https://doi.org/10.62520/fujece.1892277.
JAMA
1. Kıvrak B, Arzu M, Kaya M, Aydoğan M, Santur Y. Classification of Spam Content in Turkish Short Messages Using Transformer-Based Models. Firat University Journal of Experimental and Computational Engineering. 2026;5:438–453.
MLA
Kıvrak, Buğra, et al. “Classification of Spam Content in Turkish Short Messages Using Transformer-Based Models”. Firat University Journal of Experimental and Computational Engineering, vol. 5, no. 2, June 2026, pp. 438-53, doi:10.62520/fujece.1892277.
Vancouver
1. Buğra Kıvrak, Mehmet Arzu, Mahmut Kaya, Murat Aydoğan, Yunus Santur. Classification of Spam Content in Turkish Short Messages Using Transformer-Based Models. Firat University Journal of Experimental and Computational Engineering. June 1, 2026;5(2):438-53. doi:10.62520/fujece.1892277