Research Article

Comparison of Classification Performances of Imbalanced ML-Based NIDS Datasets

Year 2022, Issue: 41, 349 - 356, 30.11.2022
https://doi.org/10.31590/ejosat.1157441

Abstract

Network-based Intrusion Detection Systems (NIDS) track and analyze traffic from all devices on a network. Machine Learning (ML) based NIDS are now among the important tools for protecting computer networks against cyber attacks. The characteristics of the network data have a significant impact on the training and evaluation of ML-based NIDS. To evaluate the accuracy and performance of an ML model consistently across several datasets, those datasets must therefore share a common core set of features. In this study, binary classification was performed on four NIDS datasets with common NetFlow features (NF-UNSW-NB15, NF-BoT-IoT, NF-ToN-IoT and NF-CSE-CIC-IDS2018). The attack and benign classes in these datasets are imbalanced, so the Random Undersampling method was used to balance them. Random Forest, K-Nearest Neighbors, Support Vector Machines and Artificial Neural Networks were used as classification methods. The ML methods were applied to the resampled datasets, and the resulting accuracy and performance were compared across datasets. The Random Forest algorithm gave the best result on all four datasets.
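The Random Undersampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so everything here is an assumption made for illustration: the function name `random_undersample`, the toy flow records, the seed, and the choice to shrink every class to the size of the smallest one.

```python
import random
from collections import Counter, defaultdict


def random_undersample(rows, labels, seed=0):
    """Randomly drop samples from the larger classes until every class
    has as many samples as the smallest class (hypothetical helper)."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Every class is reduced to the size of the smallest class.
    target = min(len(idxs) for idxs in by_class.values())

    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, target))
    keep.sort()  # keep the retained flows in their original order

    return [rows[i] for i in keep], [labels[i] for i in keep]


# Toy data (made up): 8 benign flows (label 0) and 2 attack flows (label 1).
flows = [[i, i * 2] for i in range(10)]
labels = [0] * 8 + [1] * 2

balanced_flows, balanced_labels = random_undersample(flows, labels, seed=42)
print(Counter(balanced_labels))
```

After resampling, both classes contain two flows, and the minority (attack) flows are all retained. In practice, undersampling is usually applied only to the training split so that the test set keeps the original class distribution; whether the study did so is not stated in the abstract.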

Dengesiz ML-Tabanlı NIDS Veri Setlerinin Sınıflandırma Performanslarının Karşılaştırılması

Abstract

Network-based Intrusion Detection Systems (NIDS) are used to monitor and analyze traffic from all devices on a network. Today, Machine Learning (ML) based NIDS are among the important tools for protecting computer networks against cyber attacks. Network data features have a significant effect on the training and evaluation of ML-based NIDS. For this reason, evaluating the accuracy and performance of an ML model requires that multiple datasets contain a common core feature set. In this study, binary classification was performed using NIDS datasets with common NetFlow features (NF-UNSW-NB15, NF-BoT-IoT, NF-ToN-IoT and NF-CSE-CIC-IDS2018). The attack and normal-flow (no attack) classes in the datasets are imbalanced. To overcome this, the Random Undersampling method was used. The Random Forest, K-Nearest Neighbors, Support Vector Machines and Artificial Neural Networks algorithms were used as classification methods. The accuracy and performance of the ML methods on the resampled versions of the different datasets were compared. Across the four datasets used in this study, the Random Forest algorithm gave the best result.

There are 26 citations in total.

Details

Primary Language: Turkish
Subjects: Engineering
Journal Section: Articles
Authors:

Emine Cengiz (ORCID: 0000-0002-6695-9500)

Güneş Harman (ORCID: 0000-0001-5413-124X)

Early Pub Date: October 2, 2022
Publication Date: November 30, 2022
Published in Issue: Year 2022, Issue: 41

Cite

APA: Cengiz, E., & Harman, G. (2022). Dengesiz ML-Tabanlı NIDS Veri Setlerinin Sınıflandırma Performanslarının Karşılaştırılması. Avrupa Bilim ve Teknoloji Dergisi, (41), 349-356. https://doi.org/10.31590/ejosat.1157441