A Novel Ensemble Learning-Based Machine Learning Model for Phishing Attack
Abstract
The internet now touches nearly every aspect of daily life. Used consciously, it offers countless advantages, but it also carries many dangers. One of the most important is the possibility of being targeted by malicious actors while online. Attackers can deceive unsuspecting users by directing them to fake, misleading websites in order to obtain their sensitive information and data. In this type of attack, known as phishing, users unknowingly hand their information over to the attackers. In this study, we propose a new ensemble learning-based machine learning model, combined with feature selection, to detect phishing attacks. We apply two feature selection algorithms to improve the model's classification performance and analyze their effect on it. The dataset, reduced to the selected features, was then used to train a new ensemble model that we built with the voting classifier method from the XGBoost, CatBoost, and LightGBM algorithms. The proposed model was evaluated using widely used performance metrics and achieved an accuracy of 97.96%. It was observed to outperform studies in the literature that use the same dataset.
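As a rough illustration of the voting classifier idea mentioned in the abstract, the sketch below combines the outputs of three base models for a single sample in two common ways: hard voting (majority of predicted labels) and soft voting (average of predicted class probabilities). The abstract does not say which variant the study uses, so both are shown. The function names `hard_vote` and `soft_vote`, the class encoding (0 = legitimate, 1 = phishing), and all probability values are assumptions made up for this example; they are not results or code from the paper.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority class label among the base models' predictions."""
    return Counter(predictions).most_common(1)[0][0]


def soft_vote(probabilities, weights=None):
    """Average the base models' class-probability vectors and pick the argmax.

    Returns (predicted_label, averaged_probabilities).
    """
    if weights is None:
        weights = [1.0] * len(probabilities)  # equal weights by default
    total = sum(weights)
    n_classes = len(probabilities[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=averaged.__getitem__)
    return best, averaged


# Hypothetical per-model outputs [P(legitimate), P(phishing)] for one website.
# These numbers are illustrative only, not taken from the study.
model_probs = {
    "xgboost": [0.40, 0.60],
    "catboost": [0.55, 0.45],
    "lightgbm": [0.30, 0.70],
}

# Each model's own label is the class with the highest probability.
labels = [max(range(2), key=p.__getitem__) for p in model_probs.values()]

hard_label = hard_vote(labels)
soft_label, avg = soft_vote(list(model_probs.values()))

print("per-model labels:", labels)
print("hard vote:", hard_label)
print("soft vote:", soft_label, [round(a, 3) for a in avg])
```

The two schemes can disagree: if one model is very confident and the other two are only marginally on the opposite side, soft voting may follow the confident model while hard voting follows the majority. In practice, scikit-learn's `VotingClassifier` offers both behaviours through its `voting="hard"` and `voting="soft"` options.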
Details
Primary Language: English
Subjects: Software Engineering (Other)
Journal Section: Research Article
Authors: Ekrem Baser* (ORCID: 0000-0002-8233-7840), Türkiye
Publication Date: December 31, 2025
Submission Date: May 8, 2025
Acceptance Date: November 18, 2025
Published in Issue: Year 2025, Volume 13, Number 4
