Year 2019, Volume 7, Issue 2, Pages 205-216, 2019-05-25

The Performance Analysis of Data Mining Algorithms for Anomaly Detection
Anormal Trafik Tespiti için Veri Madenciliği Algoritmalarının Performans Analizi

Ünal Çavuşoğlu [1] , Sezgin Kaçar [2]


This study evaluates the performance of data mining algorithms used to detect harmful computer network traffic. First, features are selected from the NSL-KDD dataset using different feature selection algorithms, and several datasets are created by combining the selected attributes. Performance tests for detecting anomalous traffic are then conducted on these datasets using different data mining algorithms. Finally, the test results are used to present a performance evaluation of the data mining and feature selection algorithms.
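
The abstract does not name the specific feature selection or data mining algorithms, so the following is only a minimal sketch of the general workflow it describes: rank attributes, build a detector on the selected attribute, and score the result. It assumes information gain as the ranking criterion and a trivial single-attribute rule as the detector; both are illustrative choices, not the authors' method. The records are hypothetical toy data whose field names mimic NSL-KDD attributes.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, feature):
    """Reduction in label entropy after splitting the rows on one feature."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def binary_metrics(y_true, y_pred, positive="anomaly"):
    """Accuracy, precision and recall, treating `positive` as the attack class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = len(y_true) - tp - fp - fn
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Hypothetical toy records: the values are made up for illustration only.
rows = [
    {"protocol_type": "tcp", "flag": "SF"},
    {"protocol_type": "udp", "flag": "SF"},
    {"protocol_type": "tcp", "flag": "S0"},
    {"protocol_type": "udp", "flag": "S0"},
]
labels = ["normal", "normal", "anomaly", "anomaly"]

# Step 1: rank the attributes by information gain (assumed filter criterion).
ranking = sorted(rows[0], key=lambda f: information_gain(rows, labels, f), reverse=True)
best = ranking[0]

# Step 2: a trivial rule on the top-ranked attribute stands in for a real classifier.
predictions = ["anomaly" if r[best] == "S0" else "normal" for r in rows]

# Step 3: score the predictions against the true labels.
scores = binary_metrics(labels, predictions)
```

In a real experiment the toy records would be replaced by the NSL-KDD training and test files, and the rule in step 2 by the classifiers under comparison; the ranking and scoring steps keep the same shape.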

Primary Language: tr
Subjects: Engineering
Section: Articles
Authors

Author: Ünal Çavuşoğlu (Corresponding Author)
Institution: SAKARYA ÜNİVERSİTESİ
Country: Turkey


Author: Sezgin Kaçar
Institution: SAKARYA ÜNİVERSİTESİ
Country: Turkey


Dates

Publication Date: May 25, 2019

Bibtex @article{apjes418519, journal = {Akademik Platform Mühendislik ve Fen Bilimleri Dergisi}, issn = {}, eissn = {2147-4575}, address = {}, publisher = {Akademik Platform}, year = {2019}, volume = {7}, pages = {205 - 216}, doi = {10.21541/apjes.418519}, title = {Anormal Trafik Tespiti için Veri Madenciliği Algoritmalarının Performans Analizi}, key = {cite}, author = {Çavuşoğlu, Ünal and Kaçar, Sezgin} }
APA Çavuşoğlu, Ü., Kaçar, S. (2019). Anormal Trafik Tespiti için Veri Madenciliği Algoritmalarının Performans Analizi. Akademik Platform Mühendislik ve Fen Bilimleri Dergisi, 7(2), 205-216. DOI: 10.21541/apjes.418519
MLA Çavuşoğlu, Ü., Kaçar, S. "Anormal Trafik Tespiti için Veri Madenciliği Algoritmalarının Performans Analizi". Akademik Platform Mühendislik ve Fen Bilimleri Dergisi 7 (2019): 205-216 <https://dergipark.org.tr/tr/pub/apjes/issue/40960/418519>
Chicago Çavuşoğlu, Ü., Kaçar, S. "Anormal Trafik Tespiti için Veri Madenciliği Algoritmalarının Performans Analizi". Akademik Platform Mühendislik ve Fen Bilimleri Dergisi 7 (2019): 205-216
RIS TY - JOUR T1 - Anormal Trafik Tespiti için Veri Madenciliği Algoritmalarının Performans Analizi AU - Ünal Çavuşoğlu , Sezgin Kaçar Y1 - 2019 PY - 2019 N1 - doi: 10.21541/apjes.418519 DO - 10.21541/apjes.418519 T2 - Akademik Platform Mühendislik ve Fen Bilimleri Dergisi JF - Journal JO - JOR SP - 205 EP - 216 VL - 7 IS - 2 SN - -2147-4575 M3 - doi: 10.21541/apjes.418519 UR - https://doi.org/10.21541/apjes.418519 Y2 - 2018 ER -
EndNote %0 Akademik Platform Mühendislik ve Fen Bilimleri Dergisi Anormal Trafik Tespiti için Veri Madenciliği Algoritmalarının Performans Analizi %A Ünal Çavuşoğlu , Sezgin Kaçar %T Anormal Trafik Tespiti için Veri Madenciliği Algoritmalarının Performans Analizi %D 2019 %J Akademik Platform Mühendislik ve Fen Bilimleri Dergisi %P -2147-4575 %V 7 %N 2 %R doi: 10.21541/apjes.418519 %U 10.21541/apjes.418519
ISNAD Çavuşoğlu, Ünal, Kaçar, Sezgin. "Anormal Trafik Tespiti için Veri Madenciliği Algoritmalarının Performans Analizi". Akademik Platform Mühendislik ve Fen Bilimleri Dergisi 7 / 2 (May 2019): 205-216. https://doi.org/10.21541/apjes.418519
AMA Çavuşoğlu Ü, Kaçar S. Anormal Trafik Tespiti için Veri Madenciliği Algoritmalarının Performans Analizi. APJES. 2019; 7(2): 205-216.
Vancouver Çavuşoğlu Ü, Kaçar S. Anormal Trafik Tespiti için Veri Madenciliği Algoritmalarının Performans Analizi. Akademik Platform Mühendislik ve Fen Bilimleri Dergisi. 2019; 7(2): 205-216.