Research Article

Image Based Web Page Classification by Using Deep Learning

Volume: 10 Number: 1 April 30, 2024

Abstract

The internet plays a significant role in all aspects of our lives, and its importance grows every day, so its usability matters greatly. Low data quality and disinformation severely impair that usability, making it difficult for people to obtain accurate and clear information. Today, websites predominantly feature image-based content such as pictures and videos rather than text-based content. Classifying such content is highly important for search engines, which makes web page classification a crucial research area. This study focuses on the classification of image-based web pages. A deep learning-based approach is proposed to categorize web pages into four main groups: tourism, machinery, music, and sports. The proposed method produced its best results with the Stochastic Gradient Descent (SGD) optimization method, achieving an accuracy of 0.9737, a recall of 0.9474, an F1 score of 0.9474, and an Area Under the ROC Curve (AUC) value of 0.9649. Furthermore, the use of Deep Learning (DL) yielded the best web page classification results reported in the existing literature, particularly on the WebScreenshots dataset.
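
The abstract reports accuracy, recall, F1 score, and AUC for a four-class problem. As a minimal sketch of how such figures can be derived from a confusion matrix, the Python example below pools one-vs-rest counts over the four classes (micro-averaging). The confusion matrix, the function names, and the choice of micro-averaging are assumptions made for illustration; the abstract does not state the test-set size or the averaging scheme, so this should not be read as the paper's procedure.

```python
# Hypothetical 4-class confusion matrix (rows: true class, columns: predicted).
# The counts are illustrative only and are NOT taken from the paper.
CLASSES = ["tourism", "machinery", "music", "sports"]
CONFUSION = [
    [5, 0, 0, 0],
    [0, 5, 0, 0],
    [0, 0, 4, 1],  # one music page predicted as sports
    [0, 0, 0, 4],
]


def one_vs_rest_counts(cm):
    """Pool TP, FP, FN, TN over classes, treating each class as a binary task."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = fp = fn = tn = 0
    for k in range(n):
        tp_k = cm[k][k]
        fn_k = sum(cm[k]) - tp_k                       # class k missed
        fp_k = sum(cm[r][k] for r in range(n)) - tp_k  # wrongly labelled k
        tn_k = total - tp_k - fn_k - fp_k
        tp, fp, fn, tn = tp + tp_k, fp + fp_k, fn + fn_k, tn + tn_k
    return tp, fp, fn, tn


def micro_metrics(cm):
    """Micro-averaged metrics from pooled one-vs-rest counts (assumed scheme)."""
    tp, fp, fn, tn = one_vs_rest_counts(cm)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    # With hard predictions the ROC curve has a single operating point,
    # so its area is the mean of the true-positive and true-negative rates.
    fpr = fp / (fp + tn)
    auc = (recall + (1 - fpr)) / 2
    return {"accuracy": accuracy, "recall": recall, "f1": f1, "auc": auc}


if __name__ == "__main__":
    for name, value in micro_metrics(CONFUSION).items():
        print(f"{name}: {value:.4f}")
```

The counts in this hypothetical matrix (18 of 19 pages correct) were chosen so that the sketch prints accuracy 0.9737, recall 0.9474, F1 0.9474, and AUC 0.9649, the same values quoted in the abstract. This shows how pooled one-vs-rest accuracy can exceed recall, since true negatives dominate the pooled count; it does not establish how the paper computed its metrics.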

Details

Primary Language

English

Subjects

Computer Software

Journal Section

Research Article

Early Pub Date

March 29, 2024

Publication Date

April 30, 2024

Submission Date

October 26, 2023

Acceptance Date

November 22, 2023

Published in Issue

Year 2024 Volume: 10 Number: 1

APA
Yapıcı, M. M. (2024). Image Based Web Page Classification by Using Deep Learning. Gazi Journal of Engineering Sciences, 10(1), 72-83. https://izlik.org/JA46XX47RL
AMA
1.Yapıcı MM. Image Based Web Page Classification by Using Deep Learning. GJES. 2024;10(1):72-83. https://izlik.org/JA46XX47RL
Chicago
Yapıcı, Muhammed Mutlu. 2024. “Image Based Web Page Classification by Using Deep Learning”. Gazi Journal of Engineering Sciences 10 (1): 72-83. https://izlik.org/JA46XX47RL.
EndNote
Yapıcı MM (April 1, 2024) Image Based Web Page Classification by Using Deep Learning. Gazi Journal of Engineering Sciences 10 1 72–83.
IEEE
[1]M. M. Yapıcı, “Image Based Web Page Classification by Using Deep Learning”, GJES, vol. 10, no. 1, pp. 72–83, Apr. 2024, [Online]. Available: https://izlik.org/JA46XX47RL
ISNAD
Yapıcı, Muhammed Mutlu. “Image Based Web Page Classification by Using Deep Learning”. Gazi Journal of Engineering Sciences 10/1 (April 1, 2024): 72-83. https://izlik.org/JA46XX47RL.
JAMA
1.Yapıcı MM. Image Based Web Page Classification by Using Deep Learning. GJES. 2024;10:72–83.
MLA
Yapıcı, Muhammed Mutlu. “Image Based Web Page Classification by Using Deep Learning”. Gazi Journal of Engineering Sciences, vol. 10, no. 1, Apr. 2024, pp. 72-83, https://izlik.org/JA46XX47RL.
Vancouver
1.Muhammed Mutlu Yapıcı. Image Based Web Page Classification by Using Deep Learning. GJES [Internet]. 2024 Apr. 1;10(1):72-83. Available from: https://izlik.org/JA46XX47RL


Gazi Journal of Engineering Sciences (GJES) publishes open access articles under a Creative Commons Attribution 4.0 International License (CC BY).