Image Based Web Page Classification by Using Deep Learning
Abstract
The internet plays a significant role in every aspect of daily life, and its importance grows each day, so its usability matters a great deal. Low data quality and disinformation severely degrade that usability and make it hard for people to obtain accurate, clear information. Websites today predominantly feature image-based content such as pictures and videos rather than text, and classifying this content is important for search engines. Web page classification is therefore a key research area. This study focuses on the classification of image-based web pages. A deep learning-based approach is proposed to categorize web pages into four main groups: tourism, machinery, music, and sports. The proposed method gave its best results with the Stochastic Gradient Descent (SGD) optimization method, achieving an accuracy of 0.9737, a recall of 0.9474, an F1 score of 0.9474, and an Area Under the ROC Curve (AUC) of 0.9649. Furthermore, the use of Deep Learning (DL) yielded state-of-the-art results for web page classification in the existing literature, particularly on the WebScreenshots dataset.
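The four reported metrics can be related to one another with a short sketch. The abstract does not state how the metrics were averaged across the four classes, so the code below assumes one possible reading: each class is scored one-vs-rest and the counts are pooled. The confusion matrix is hypothetical (76 test pages, 19 per class, 4 errors) and is not taken from the paper; it was chosen only because it yields figures of the same magnitude as those quoted above. The AUC here is the single-threshold value, the mean of recall and specificity, which is what hard class predictions give.

```python
# Sketch only: pooled one-vs-rest metrics for a multi-class classifier.
# Assumption: metrics are pooled over classes; the paper does not confirm this.

CLASSES = ["tourism", "machinery", "music", "sports"]

# Hypothetical confusion matrix (rows = true class, columns = predicted class).
# These counts are illustrative and are NOT results from the paper.
HYPOTHETICAL_CONFUSION = [
    [18, 1, 0, 0],
    [0, 19, 0, 0],
    [0, 1, 18, 0],
    [1, 0, 1, 17],
]


def pooled_one_vs_rest_metrics(confusion):
    """Return accuracy, recall, F1 and hard-label AUC from pooled counts."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = fp = fn = tn = 0
    for c in range(n_classes):
        tp_c = confusion[c][c]
        fn_c = sum(confusion[c]) - tp_c  # true class c, predicted otherwise
        fp_c = sum(confusion[r][c] for r in range(n_classes)) - tp_c
        tn_c = total - tp_c - fn_c - fp_c
        tp, fp, fn, tn = tp + tp_c, fp + fp_c, fn + fn_c, tn + tn_c

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    specificity = tn / (tn + fp)
    # With hard predictions the ROC curve has one operating point, so its
    # area equals the mean of recall and specificity.
    auc = (recall + specificity) / 2
    return {
        "accuracy": round(accuracy, 4),
        "recall": round(recall, 4),
        "f1": round(f1, 4),
        "auc": round(auc, 4),
    }


if __name__ == "__main__":
    print(pooled_one_vs_rest_metrics(HYPOTHETICAL_CONFUSION))
    # {'accuracy': 0.9737, 'recall': 0.9474, 'f1': 0.9474, 'auc': 0.9649}
```

Under this pooling, recall and F1 coincide because every misclassified page counts once as a false negative and once as a false positive, while accuracy is higher because each error affects only two of the four one-vs-rest tallies.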
Details
Primary Language
English
Subjects
Computer Software
Journal Section
Research Article
Authors
Early Pub Date
March 29, 2024
Publication Date
April 30, 2024
Submission Date
October 26, 2023
Acceptance Date
November 22, 2023
Published in Issue
Year 2024 Volume: 10 Number: 1
APA
Yapıcı, M. M. (2024). Image Based Web Page Classification by Using Deep Learning. Gazi Journal of Engineering Sciences, 10(1), 72-83. https://izlik.org/JA46XX47RL
AMA
1. Yapıcı MM. Image Based Web Page Classification by Using Deep Learning. GJES. 2024;10(1):72-83. https://izlik.org/JA46XX47RL
Chicago
Yapıcı, Muhammed Mutlu. 2024. “Image Based Web Page Classification by Using Deep Learning”. Gazi Journal of Engineering Sciences 10 (1): 72-83. https://izlik.org/JA46XX47RL.
EndNote
Yapıcı MM (April 1, 2024) Image Based Web Page Classification by Using Deep Learning. Gazi Journal of Engineering Sciences 10 1 72–83.
IEEE
[1] M. M. Yapıcı, “Image Based Web Page Classification by Using Deep Learning”, GJES, vol. 10, no. 1, pp. 72–83, Apr. 2024, [Online]. Available: https://izlik.org/JA46XX47RL
ISNAD
Yapıcı, Muhammed Mutlu. “Image Based Web Page Classification by Using Deep Learning”. Gazi Journal of Engineering Sciences 10/1 (April 1, 2024): 72-83. https://izlik.org/JA46XX47RL.
JAMA
1. Yapıcı MM. Image Based Web Page Classification by Using Deep Learning. GJES. 2024;10:72–83.
MLA
Yapıcı, Muhammed Mutlu. “Image Based Web Page Classification by Using Deep Learning”. Gazi Journal of Engineering Sciences, vol. 10, no. 1, Apr. 2024, pp. 72-83, https://izlik.org/JA46XX47RL.
Vancouver
1. Muhammed Mutlu Yapıcı. Image Based Web Page Classification by Using Deep Learning. GJES [Internet]. 2024 Apr. 1;10(1):72-83. Available from: https://izlik.org/JA46XX47RL
