Brand Recognition of Phishing Web Pages via Global Image Descriptors
Abstract
Phishing attacks, which have increased exponentially in recent years, are a form of cyber attack that aims to steal the sensitive credentials of unsuspecting users. In general, attackers attempt to deceive users by creating and distributing a fake but visually similar copy of a legitimate web page that is already in use. In this study, we propose an approach for recognizing phishing web pages that uses two global image descriptors, GIST and local binary patterns (LBP), which have not previously been employed in the phishing web page recognition literature. To obtain a discriminative representation, we experimented with two visual feature extraction schemes: (1) “holistic” and (2) “multi-level patches”. The “holistic” scheme uses only the whole web page screenshot, whereas the “multi-level patches” scheme divides each screenshot into equally sized smaller crops over an increasing number of levels. To evaluate the proposed approach, we used a publicly available phishing web page dataset that contains screenshots of 14 highly phished brands as well as legitimate web pages, which poses an open-set problem. The dataset comprises 1313 training and 1539 testing cases. The visual signatures extracted with the GIST and LBP descriptors were fed to several machine learning models: SVM, Random Forest, and XGBoost (regularized gradient tree boosting). In our experiments, XGBoost was the best-performing learner. With it, we obtained 87.7% (GIST) and 83.1% (LBP) validation accuracy using the “multi-level patches” representation. These results show that the chosen global image descriptors can be successfully employed for detecting and recognizing phishing web pages. In addition, the average time required to process one screenshot with GIST descriptors (around 1.12 sec.) indicates that the proposed scheme with GIST can be used effectively as a browser-based plug-in for recognizing the brands of phishing web pages.
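The difference between the “holistic” and “multi-level patches” schemes can be illustrated with a minimal sketch. It is not the implementation used in the study: the function names, the basic 8-neighbour LBP variant, the 256-bin histogram, and the grid sizes (1×1, 2×2, 4×4) are all assumptions made for illustration, since the abstract does not specify them. A GIST descriptor could be substituted for the per-patch LBP histogram in the same way.

```python
import numpy as np


def lbp_histogram(gray):
    """Normalized 256-bin histogram of basic 8-neighbour LBP codes.

    Assumption: the plain 3x3 LBP operator; the study may use another variant.
    """
    gray = np.asarray(gray, dtype=np.float64)
    h, w = gray.shape
    if h < 3 or w < 3:
        return np.zeros(256)  # patch too small to hold a 3x3 neighbourhood
    center = gray[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int32)
    # Neighbour offsets in clockwise order, one bit of the code per neighbour.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()


def multi_level_descriptor(gray, levels=3):
    """Concatenate per-patch LBP histograms over a growing grid.

    Level k splits the screenshot into a 2**k x 2**k grid (hypothetical choice).
    levels=1 reduces to the holistic scheme: one histogram for the whole image.
    """
    gray = np.asarray(gray)
    features = []
    for level in range(levels):
        n = 2 ** level
        row_groups = np.array_split(np.arange(gray.shape[0]), n)
        col_groups = np.array_split(np.arange(gray.shape[1]), n)
        for rows in row_groups:
            for cols in col_groups:
                patch = gray[rows[0]:rows[-1] + 1, cols[0]:cols[-1] + 1]
                features.append(lbp_histogram(patch))
    return np.concatenate(features)


# Synthetic stand-in for a grayscale screenshot (not real data).
rng = np.random.default_rng(0)
screenshot = rng.integers(0, 256, size=(64, 96))

holistic = multi_level_descriptor(screenshot, levels=1)
multi_level = multi_level_descriptor(screenshot, levels=3)
print(holistic.shape, multi_level.shape)
```

Under these assumptions the holistic vector has 256 dimensions, while three levels yield (1 + 4 + 16) × 256 dimensions. The resulting vectors would then be passed to a classifier such as SVM, Random Forest, or XGBoost.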
Details
Primary Language
English
Subjects
Engineering
Journal Section
Research Article
Authors
Esra Eroğlu
*
0000-0002-6140-6894
Ahmet Selman Bozkır
0000-0003-4305-7800
Murat Aydos
0000-0002-7570-9204
Publication Date
October 31, 2019
Submission Date
August 1, 2019
Acceptance Date
October 25, 2019
Published in Issue
Year 2019