Research Article

Darknet Web Traffic Classification via Gradient Boosting Algorithm

Year 2022, Volume 14, Issue 2, pp. 794-798, 31.07.2022
https://doi.org/10.29137/umagd.1117634

Abstract

Classifying network traffic not only helps institutions improve the quality of their network services but also helps protect important data. Because port-based and payload-based classification are insufficient on encrypted networks, machine learning algorithms are frequently used for this task. In this study, VPN and Tor network traffic, combined under the darknet category, was classified with the Gradient Boosting Algorithm. 70% of the dataset was reserved for training and 30% for testing, and 10-fold cross-validation was applied on the training set. Network flows in 8 categories (Audio-Streaming, Browsing, Chat, E-mail, P2P, File Transfer, Video-Streaming, and VOIP) were classified with 99.8% accuracy. The proposed method automates the analysis of darknet network traffic and enables organizations to protect their important data quickly and with high accuracy.
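The evaluation protocol described in the abstract (a 70/30 hold-out split, 10-fold cross-validation on the training portion, and a gradient boosting classifier) can be illustrated with a minimal, dependency-free sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic three-class toy data stands in for the darknet flow features, the base learners are depth-1 regression stumps fitted to softmax residuals, and the names `fit_stump`, `fit_gb`, `predict`, and `run_protocol`, as well as the round count and learning rate, are hypothetical.

```python
import math
import random


def fit_stump(X, r):
    """Least-squares regression stump: returns (feature, threshold, left, right)."""
    n, best, total = len(X), None, sum(r)
    for j in range(len(X[0])):
        order = sorted(range(n), key=lambda i: X[i][j])
        left_sum = 0.0
        for pos in range(1, n):
            left_sum += r[order[pos - 1]]
            a, b = X[order[pos - 1]][j], X[order[pos]][j]
            if a == b:
                continue  # cannot split between identical feature values
            right_sum = total - left_sum
            gain = left_sum ** 2 / pos + right_sum ** 2 / (n - pos)
            if best is None or gain > best[0]:
                best = (gain, j, (a + b) / 2, left_sum / pos, right_sum / (n - pos))
    if best is None:  # every feature is constant: predict the mean residual
        return (0, math.inf, total / n, total / n)
    return best[1:]


def stump_predict(stump, x):
    j, thr, left, right = stump
    return left if x[j] <= thr else right


def softmax(scores):
    m = max(scores)
    e = [math.exp(s - m) for s in scores]
    t = sum(e)
    return [v / t for v in e]


def fit_gb(X, y, n_classes, n_rounds=10, lr=0.5):
    """Multi-class gradient boosting: one stump per class per round."""
    F = [[0.0] * n_classes for _ in X]  # current raw score per sample and class
    model = []
    for _ in range(n_rounds):
        P = [softmax(f) for f in F]
        stumps = []
        for k in range(n_classes):
            # Negative gradient of the multinomial log-loss for class k.
            res = [(1.0 if y[i] == k else 0.0) - P[i][k] for i in range(len(X))]
            j, thr, left, right = fit_stump(X, res)
            s = (j, thr, lr * left, lr * right)  # store the shrunken stump
            stumps.append(s)
            for i, x in enumerate(X):
                F[i][k] += stump_predict(s, x)
        model.append(stumps)
    return model


def predict(model, x):
    n_classes = len(model[0])
    scores = [sum(stump_predict(rnd[k], x) for rnd in model) for k in range(n_classes)]
    return max(range(n_classes), key=scores.__getitem__)


def accuracy(model, X, y):
    return sum(predict(model, x) == t for x, t in zip(X, y)) / len(y)


def run_protocol(X, y, n_classes, seed=0):
    """70/30 hold-out split, then 10-fold cross-validation on the training part."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    n_train = len(idx) * 7 // 10
    train, test = idx[:n_train], idx[n_train:]
    cv_scores = []
    for held_out in (train[i::10] for i in range(10)):
        held = set(held_out)
        fit_idx = [i for i in train if i not in held]
        m = fit_gb([X[i] for i in fit_idx], [y[i] for i in fit_idx], n_classes)
        cv_scores.append(accuracy(m, [X[i] for i in held_out], [y[i] for i in held_out]))
    final = fit_gb([X[i] for i in train], [y[i] for i in train], n_classes)
    test_acc = accuracy(final, [X[i] for i in test], [y[i] for i in test])
    return cv_scores, test_acc, len(train), len(test)


# Synthetic stand-in data (assumption): 3 well-separated classes, 2 features.
rng = random.Random(0)
X, y = [], []
for k in range(3):
    for _ in range(50):
        X.append([rng.gauss(4.0 * k, 0.5), rng.gauss(0.0, 1.0)])
        y.append(k)

cv_scores, test_acc, n_train, n_test = run_protocol(X, y, n_classes=3)
print("mean CV accuracy:", sum(cv_scores) / len(cv_scores))
print("hold-out test accuracy:", test_acc)
```

The test set is touched only once, after cross-validation on the training portion, which mirrors the order of steps given in the abstract. In practice a library implementation of gradient boosting would likely replace the hand-written stumps; the abstract does not say which implementation or hyperparameters the authors used.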

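Gradyan Artırma Algoritması ile Karanlık Ağ Web Trafiği Sınıflandırması (Turkish title; in English: Darknet Web Traffic Classification with the Gradient Boosting Algorithm)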

Abstract

Classifying network traffic not only contributes to improving the quality of institutions' network services but also helps protect their important data. Machine learning algorithms are frequently used to classify network traffic because port-based and payload-based classification are insufficient on encrypted networks. In this study, VPN and Tor network traffic, combined under the Darknet category, was classified with the Gradient Boosting Algorithm. 70% of the dataset was reserved for training and 30% for testing. 10-fold cross-validation was applied on the training set. Network flows in 8 categories (Audio Streaming, Browsing, Chat, E-mail, P2P, File Transfer, Video Streaming, and VOIP) were classified with 99.8% accuracy. The proposed method automates the process of analyzing network traffic from the darknet. It enables organizations to protect their important data with high accuracy in a short time.

References

  • Cao, Z., Xiong, G., Zhao, Y., Li, Z., & Guo, L. (2014, November). A survey on encrypted traffic classification. In International Conference on Applications and Techniques in Information Security (pp. 73-81). Springer, Berlin, Heidelberg.
  • Degermark, M., Nordgren, B., & Pink, S. (1999). RFC2507: IP header compression. RFC Editor.
  • Digital 2021: the latest insights into the state of digital - We Are Social UK, We Are Social UK, Jan. 27, 2021. [Online]. Available: https://wearesocial.com/uk/blog/2021/01/digital-2021-the-latest-insights-into-the-state-of-digital/. [Accessed: May 16, 2022]
  • Friedman, J. H. (2001). Greedy function approximation: a gradient boosting machine. Annals of statistics, 1189-1232.
  • Habibi Lashkari, A., Kaur, G., & Rahali, A. (2020, November). DIDarknet: A Contemporary Approach to Detect and Characterize the Darknet Traffic using Deep Image Learning. In 2020 the 10th International Conference on Communication and Network Security (pp. 1-13).
  • Iliadis, L. A., & Kaifas, T. (2021, July). Darknet Traffic Classification using Machine Learning Techniques. In 2021 10th International Conference on Modern Circuits and Systems Technologies (MOCAST) (pp. 1-4). IEEE.
  • Jun, L., Shunyi, Z., Yanqing, L., & Zailong, Z. (2007, August). Internet traffic classification using machine learning. In 2007 Second International Conference on Communications and Networking in China (pp. 239-243). IEEE.
  • Karagiannis, T., Broido, A., Faloutsos, M., & Claffy, K. C. (2004, October). Transport layer identification of P2P traffic. In Proceedings of the 4th ACM SIGCOMM conference on Internet measurement (pp. 121-134).
  • Kaur, S., & Randhawa, S. (2020). Dark web: A web of crimes. Wireless Personal Communications, 112(4), 2131-2158.
  • Li, Y., & Lu, Y. (2021). ETCC: Encrypted two-label classification using CNN. Security and Communication Networks, 2021, 1-11.
  • Li, Y., Lu, Y., & Li, S. (2021, May). EZAC: Encrypted Zero-day Applications Classification using CNN and K-Means. In 2021 IEEE 24th International Conference on Computer Supported Cooperative Work in Design (CSCWD) (pp. 378-383). IEEE.
  • Petit, C., Bezemer, R., & Atallah, L. (2018). A review of recent advances in data analytics for post-operative patient deterioration detection. Journal of clinical monitoring and computing, 32(3), 391-402.
  • Sui, D., Caverlee, J., & Rudesill, D. (2015). The deep web and the darknet. Washington DC: Publication of the Wilson Center.
There are 13 citations in total.

Details

Primary Language: English
Subjects: Engineering
Journal Section: Articles
Authors:

Fahrettin Horasan (ORCID: 0000-0003-4554-9083)

Ahmet Haşim Yurttakal (ORCID: 0000-0001-5170-6466)

Publication Date: July 31, 2022
Submission Date: May 16, 2022
Published in Issue: Year 2022, Volume 14, Issue 2

Cite

APA: Horasan, F., & Yurttakal, A. H. (2022). Darknet Web Traffic Classification via Gradient Boosting Algorithm. International Journal of Engineering Research and Development, 14(2), 794-798. https://doi.org/10.29137/umagd.1117634

Cited By

INTERNET OF THINGS BOTNET DETECTION VIA ENSEMBLE DEEP NEURAL NETWORKS
International Journal of 3D Printing Technologies and Digital Industry
https://doi.org/10.46519/ij3dptdi.1293277

All Rights Reserved. Kırıkkale University, Faculty of Engineering and Natural Science.