Adv. Artif. Intell. Res.

Advances in Artificial Intelligence Research

2757-7422

Osman ÖZKARACA

10.54569/aair.1549781

Machine Learning (Other); Natural Language Processing

A Comparative Study of Machine Learning Classifiers for Different Language Spam SMS Detection: Performance Evaluation and Analysis

https://orcid.org/0009-0009-7647-0731

Dev Sharma

Samrat Kumar

Jagannath University

12 30 2024

4 2 69 77 09 13 2024 12 28 2024

2020

As the number of mobile device users continues to rise, SMS (Short Message Service) remains a prevalent communication tool, available on both smartphones and basic phones, and SMS traffic has surged accordingly. This growth has been accompanied by a rise in spam messages, as spammers seek financial or business gain through marketing promotions, lottery scams, and credit card information theft. Spam classification has therefore become a focal point of research. In this paper, we evaluate 11 machine learning algorithms for SMS spam detection, including multinomial Naïve Bayes, K-Nearest Neighbors (KNN), and Random Forest. Using datasets from the UCI and Bangla SMS collections, our experiments show that multinomial Naïve Bayes surpasses previous models in spam detection, achieving accuracies of 98.65% and 89.10% on the respective datasets.
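To illustrate the kind of classifier the abstract refers to, the following is a minimal sketch of multinomial Naïve Bayes with add-one (Laplace) smoothing, written in plain Python. It is not the authors' pipeline: the class name `ToyMultinomialNB`, the whitespace tokenizer, and the four training messages are assumptions made up for this example, and the output says nothing about the accuracies reported above.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (no punctuation handling or stemming)."""
    return text.lower().split()


class ToyMultinomialNB:
    """Hypothetical minimal multinomial Naive Bayes over raw word counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; 1.0 is add-one smoothing

    def fit(self, messages, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(messages, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_scores(self, text):
        """Return log P(class) + sum of log P(word | class) for each class."""
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # words unseen in training are ignored
                    score += math.log((counts[tok] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy messages, not drawn from the UCI or Bangla collections.
train = [
    ("win a free lottery prize now", "spam"),
    ("claim your free credit card offer", "spam"),
    ("are we meeting for lunch today", "ham"),
    ("call me when you get home", "ham"),
]
model = ToyMultinomialNB().fit([t for t, _ in train], [y for _, y in train])

print(model.predict("free prize offer"))   # spam
print(model.predict("meeting for lunch"))  # ham
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied. A real experiment would also need a held-out test split, a tokenizer suited to each language, and some treatment of class imbalance, none of which this sketch attempts.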

Spam SMS Detection; NLP; Machine Learning; Deep Learning; Naïve Bayes
