Research Article

Context-Aware Phishing URL Detection: Harnessing the Power of Large Language Models

Volume: 3 Number: 2 December 31, 2025

Abstract

Phishing attacks remain a significant cybersecurity threat, exploiting users by disguising malicious URLs to resemble legitimate websites. Traditional machine learning approaches often struggle to detect such threats because they rely on handcrafted features and have limited contextual understanding. This study investigates the effectiveness of three transformer-based models, namely BERT, GPT-2, and XLM-RoBERTa, for automatic phishing URL detection. The models were trained and evaluated on a labeled dataset of phishing and legitimate URLs, and their performance was assessed with standard classification metrics and diagnostic curves. XLM-RoBERTa achieved the highest accuracy (96.1%), outperforming BERT (94.2%) and GPT-2 (92.4%). Precision, recall, F1-score, and AUC-ROC were consistently high across all three models, with XLM-RoBERTa showing the most balanced performance. Further evaluation using precision-recall curves, lift and gain charts, calibration curves, and Kolmogorov–Smirnov (KS) plots provided deeper insight into model discrimination and calibration. These findings underscore the advantages of deep contextualized language models for accurate and reliable phishing URL detection and point to a promising approach for strengthening cybersecurity defenses.
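To make the evaluation metrics named in the abstract concrete, the following is a minimal sketch of how precision, recall, F1-score, accuracy, AUC-ROC, and the KS statistic can be computed from model scores. It is not the authors' evaluation code: the function names (`confusion_metrics`, `roc_auc`, `ks_statistic`), the 0.5 decision threshold, and the eight toy scores are all assumptions made for illustration, and the numbers are unrelated to the results reported above.

```python
from typing import Dict, List, Sequence, Tuple


def split_scores(y_true: Sequence[int], scores: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Separate scores of phishing (label 1) and legitimate (label 0) URLs."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    return pos, neg


def confusion_metrics(y_true: Sequence[int], scores: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Precision, recall, F1, and accuracy at an assumed decision threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(y_true, preds) if y == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the probability that a random phishing URL outscores a
    random legitimate one (ties count as half)."""
    pos, neg = split_scores(y_true, scores)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Largest gap between the empirical score CDFs of the two classes."""
    pos, neg = split_scores(y_true, scores)
    best = 0.0
    for t in sorted(set(scores)):
        cdf_pos = sum(s <= t for s in pos) / len(pos)
        cdf_neg = sum(s <= t for s in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best


# Toy data (invented): 1 = phishing, 0 = legitimate; scores are P(phishing).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.40, 0.20, 0.05]

metrics = confusion_metrics(y_true, scores)
auc = roc_auc(y_true, scores)
ks = ks_statistic(y_true, scores)
print(metrics, auc, ks)
```

The pairwise AUC computation is quadratic in the number of samples, which is fine for a small illustration; on a dataset of realistic size a rank-based implementation from an established library would be the more practical choice.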

Details

Primary Language

English

Subjects

System and Network Security, Natural Language Processing

Journal Section

Research Article

Publication Date

December 31, 2025

Submission Date

June 30, 2025

Acceptance Date

December 30, 2025

Published in Issue

Year 2025 Volume: 3 Number: 2

APA
Ali Shah Zaman, T., Hussain, M., Ali, S., & Fareed, M. A. (2025). Context-Aware Phishing URL Detection: Harnessing the Power of Large Language Models. Current Trends in Computing, 3(2), 1-25. https://izlik.org/JA94FR85RZ
AMA
1. Ali Shah Zaman T, Hussain M, Ali S, Fareed MA. Context-Aware Phishing URL Detection: Harnessing the Power of Large Language Models. CTC. 2025;3(2):1-25. https://izlik.org/JA94FR85RZ
Chicago
Ali Shah Zaman, Tabinda, Muzammal Hussain, Saddam Ali, and Muhammad Aqib Fareed. 2025. “Context-Aware Phishing URL Detection: Harnessing the Power of Large Language Models”. Current Trends in Computing 3 (2): 1-25. https://izlik.org/JA94FR85RZ.
EndNote
Ali Shah Zaman T, Hussain M, Ali S, Fareed MA (December 1, 2025) Context-Aware Phishing URL Detection: Harnessing the Power of Large Language Models. Current Trends in Computing 3 2 1–25.
IEEE
[1] T. Ali Shah Zaman, M. Hussain, S. Ali, and M. A. Fareed, “Context-Aware Phishing URL Detection: Harnessing the Power of Large Language Models”, CTC, vol. 3, no. 2, pp. 1–25, Dec. 2025, [Online]. Available: https://izlik.org/JA94FR85RZ
ISNAD
Ali Shah Zaman, Tabinda - Hussain, Muzammal - Ali, Saddam - Fareed, Muhammad Aqib. “Context-Aware Phishing URL Detection: Harnessing the Power of Large Language Models”. Current Trends in Computing 3/2 (December 1, 2025): 1-25. https://izlik.org/JA94FR85RZ.
JAMA
1. Ali Shah Zaman T, Hussain M, Ali S, Fareed MA. Context-Aware Phishing URL Detection: Harnessing the Power of Large Language Models. CTC. 2025;3:1–25.
MLA
Ali Shah Zaman, Tabinda, et al. “Context-Aware Phishing URL Detection: Harnessing the Power of Large Language Models”. Current Trends in Computing, vol. 3, no. 2, Dec. 2025, pp. 1-25, https://izlik.org/JA94FR85RZ.
Vancouver
1. Tabinda Ali Shah Zaman, Muzammal Hussain, Saddam Ali, Muhammad Aqib Fareed. Context-Aware Phishing URL Detection: Harnessing the Power of Large Language Models. CTC [Internet]. 2025 Dec. 1;3(2):1-25. Available from: https://izlik.org/JA94FR85RZ