TY - JOUR
T1 - Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı
TT - Using Machine Learning Algorithms for Malware Detection
AU - Tüfekci, Pınar
AU - Önal, Çetin Mutlu
PY - 2024
DA - January
DO - 10.29130/dubited.1287453
JF - Duzce University Journal of Science and Technology
JO - DÜBİTED
PB - Duzce University
WT - DergiPark
SN - 2148-2446
SP - 307
EP - 319
VL - 12
IS - 1
LA - tr
AB - Gelişen teknoloji sayesinde bilgiye kolay erişim sağlansa da, bu durum kötü amaçlı eylemlerin artışına da sebep olmuştur. Android işletim sistemlerinde sıklıkla rastlanan kötü amaçlı yazılımlar (malware), kullanıcıların cihazındaki verilere erişerek büyük bir tehdit oluşturmaktadır. Bu çalışma, kötü amaçlı yazılımları tespit etmek amacıyla yüksek doğruluklu ve güvenilir bir model geliştirmeyi hedeflemektedir. Modelleme çalışmalarında popüler bir veri seti olan DREBIN-215 Android Malware Dataset kullanılmıştır. Makine Öğrenmesi algoritmaları arasından Support Vector Machines (SVM), Gradient Boosting (GB), Multi Layer Perceptron (MLP), Naïve Bayes (MNB), K-En Yakın Komşu (KNN) ve Random Forest (RF) algoritmaları uygulanmıştır. Algoritmaların performansları, varsayılan parametreler ve GridSearch yöntemiyle elde edilen en iyi hiperparametre değerlerinin kullanılmasıyla değerlendirilmiştir. En başarılı model, SVM algoritmasıyla en iyi hiperparametrelerin uygulanması sonucu %99.07 doğruluk oranıyla elde edilmiştir.
KW - Kötü amaçlı yazılım
KW - Makine öğrenmesi
KW - Support Vector Machines
KW - Gradient Boosting
KW - Multi Layer Perceptron
N2 - Although advanced technology has facilitated easy access to information, it has also led to an increase in malicious activities. Malware, frequently encountered in Android operating systems, poses a significant threat by accessing users' data on their devices. This study aims to develop a highly accurate and reliable model for detecting malware. The modeling work used the DREBIN-215 Android Malware Dataset, a popular and imbalanced dataset. The Machine Learning algorithms applied were Support Vector Machines (SVM), Gradient Boosting (GB), Multi Layer Perceptron (MLP), Naïve Bayes (MNB), K-Nearest Neighbor (KNN) and Random Forest (RF). Each algorithm was evaluated with its default parameters and with the best hyperparameter values found through the GridSearch method. The most successful model, SVM with the best hyperparameters, achieved an accuracy of 99.07%.
CR - [1] A. T. Kabakuş, İ. A. Doğru and A. Çetin, "Android kötücül yazılım tespit ve koruma sistemleri", Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi, vol. 31, no. 1, pp. 9-16, Feb. 2015.
CR - [2] M. Grace, Y. Zhou, Q. Zhang, S. Zou and X. Jiang, "RiskRanker: scalable and accurate zero-day android malware detection", MobiSys '12: Proceedings of the 10th international conference on Mobile systems, applications, and services, June 2012, Pages 281–294, https://doi.org/10.1145/2307636.2307663
CR - [3] N. Zhang, Y. Tan, C. Yang and Y. Li, "Deep learning feature exploration for Android malware detection", Applied Soft Computing, Volume 102, April 2021, https://doi.org/10.1016/j.asoc.2020.107069
CR - [4] A. Razgallah, R. Khoury, S. Halle and K. Khanmohammadi, "A survey of malware detection in Android apps: Recommendations and perspectives for future research", Computer Science Review, Volume 39, February 2021, https://doi.org/10.1016/j.cosrev.2020.100358
CR - [5] A. Guerra-Manzanares, M. Luckner and H. Bahsi, "Concept drift and cross-device behavior: Challenges and implications for effective android malware detection", Computers & Security, Volume 120, September 2022, https://doi.org/10.1016/j.cose.2022.102757
CR - [6] F. Ou and J. Xu, "S3Feature: A static sensitive subgraph-based feature for android malware detection", Computers & Security, Volume 112, January 2022, https://doi.org/10.1016/j.cose.2021.102513
CR - [7] A. Martin, R. Lara-Cabrera and D. Camacho, "Android malware detection through hybrid features fusion and ensemble classifiers: The AndroPyTool framework and the OmniDroid dataset", Information Fusion, Volume 52, December 2019, Pages 128-142, https://doi.org/10.1016/j.inffus.2018.12.006
CR - [8] D. Vasan, M. Alazab, S. Wassan, H. Naeem, B. Safaei and Q. Zheng, "IMCFN: Image-based malware classification using fine-tuned convolutional neural network architecture", Computer Networks, Volume 171, 22 April 2020, https://doi.org/10.1016/j.comnet.2020.107138
CR - [9] A. Ananya, P. Vinod and M. Shojafar, "SysDroid: A Dynamic ML-based Android Malware Analyzer using System Call Traces", Cluster Computing, December 2020, DOI:10.1007/s10586-019-03045-6
CR - [10] K. Lin, X. Xu and F. Xiao, "MFFusion: A Multi-level Features Fusion Model for Malicious Traffic Detection based on Deep Learning", Computer Networks, Volume 202, 15 January 2022, https://doi.org/10.1016/j.comnet.2021.108658
CR - [11] W. W. Lo, S. Layeghy, M. Sarhan, M. Gallagher and M. Portmann, "Graph neural network-based android malware classification with jumping knowledge", 2022 IEEE Conference on Dependable and Secure Computing (DSC), 1–9, 2022.
CR - [12] L. Onwuzurike, E. Mariconti, P. Andriotis, E. D. Cristofaro, G. J. Ross and G. Stringhini, "Mamadroid: Detecting android malware by building markov chains of behavioral models", ACM Trans. Priv. Secur. 22, 14:1–14:3, 2019.
CR - [13] Y. Wu, X. Li, D. Zou, W. Yang, X. Zhang and H. Jin, "Malscan: Fast market-wide mobile malware scanning by social-network centrality analysis", in: 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019, San Diego, CA, USA, IEEE. pp. 139–150, 2019.
CR - [14] P. Xu, C. Eckert and A. Zarras, "Detecting and categorizing android malware with graph neural networks", in: SAC ’21: The 36th ACM/SIGAPP Symposium on Applied Computing, pp. 409–412, 2021.
CR - [15] H. Gao, S. Cheng and W. Zhang, "Gdroid: Android malware detection and classification with graph convolutional network", Comput. Secur. 106, 2021.
CR - [16] M.S. Rana, S. S. M. M. Rahman, and A. H. Sung, "Evaluation of tree based machine learning classifiers for android malware detection", Computational Collective Intelligence: 10th International Conference, ICCCI 2018, Bristol, UK, September 5-7, 2018, Proceedings, Part II 10. Springer International Publishing, 2018.
CR - [17] H. Peng, C. Gates, B. Sarma, N. Li, Y. Qi, R. Potharaju, and I. Molloy, "Using probabilistic generative models for ranking risks of android apps", In Proceedings of the 2012 ACM conference on Computer and communications security (pp. 241-252), 2012.
CR - [18] D. Arp, M. Spreitzenbarth, M. Hubner, H. Gascon, K. Rieck, and C. E. R. T. Siemens, "Drebin: Effective and explainable detection of android malware in your pocket", In Ndss, Vol. 14, pp. 23-26, 2014.
CR - [19] J. Li, L. Sun, Q. Yan, Z. Li, W. Srisa-an and H. Ye, "Significant Permission Identification for Machine-Learning-Based Android Malware Detection", in IEEE Transactions on Industrial Informatics, vol. 14, no. 7, pp. 3216-3225, July 2018, doi: 10.1109/TII.2017.2789219.
CR - [20] M. Qiao, A. H. Sung and Q. Liu, "Merging Permission and API Features for Android Malware Detection", 2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), Kumamoto, Japan, 2016, pp. 566-571, doi: 10.1109/IIAI-AAI.2016.237.
CR - [21] A. Aydın, İ. A. Doğru and M. Dörterler, "Makine Öğrenmesi Algoritmalarıyla Android Kötücül Yazılım Uygulamalarının Tespiti", Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, vol. 22, no. 2, pp. 1087-1094, Aug. 2018, doi:10.19113/sdufbed.20066
CR - [22] A. Güngör, İ. Dogru, N. Barışçı and S. Toklu, "Görüntü tabanlı özelliklerden ve makine öğrenmesi yöntemlerinden faydalanılarak kötücül yazılım tespiti", Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, vol. 38, no. 3, pp. 1781-1792, Jan. 2023, doi:10.17341/gazimmfd.994289
CR - [23] A. Utku, İ. A. Doğru and M. A. Akcayol, "Decision tree based android malware detection system", 26th Signal Processing and Communications Applications Conference (SIU), Izmir, Turkey, 2018, pp. 1-4, doi: 10.1109/SIU.2018.8404151.
CR - [24] Z. Liu, R. Wang, N. Japkowicz, H. M. Gomes, B. Peng and W. Zhang, "SeGDroid: An Android malware detection method based on sensitive function call graph learning", Expert Systems with Applications, 2023, https://doi.org/10.1016/j.eswa.2023.121125
CR - [25] S. Yang, Y. Wang, H. Xu, F. Xu and M. Chen, "An Android Malware Detection and Classification Approach Based on Contrastive Learning", Computers & Security, Volume 123, 2022, https://doi.org/10.1016/j.cose.2022.102915
CR - [26] J. Sahs and L. Khan, "A Machine Learning Approach to Android Malware Detection", 2012 European Intelligence and Security Informatics Conference, Odense, Denmark, 2012, pp. 141-147, doi: 10.1109/EISIC.2012.34.
CR - [27] Ö. Kiraz and İ. A. Doğru, "Android Kötücül Yazılım Tespit Sistemleri İncelemesi", Düzce Üniversitesi Bilim ve Teknoloji Dergisi, vol. 5, no. 1, pp. 281-298, Jan. 2017.
CR - [28] S. Haykin, "Neural Networks: A Comprehensive Foundation", Prentice-Hall, Ontario, 837 pp., 1999.
CR - [29] J. Friedman, "Greedy Function Approximation: A Gradient Boosting Machine", The Annals of Statistics, 29(5), 11-28, 2000.
CR - [30] V. N. Vapnik, "The Nature of Statistical Learning Theory", Springer, 314 pp., 2000.
CR - [31] J. VanderPlas, "Python Data Science Handbook Essential Tools for Working with Data", O'Reilly Media, 2016.
CR - [32] G. O. Campos, A. Zimek, J. Sander, R.J.G.B. Campello, B. Micenková, E. Schubert, I. Assent and M.E. Houle, "On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study", Data Mining and Knowledge Discovery, vol. 30, no. 4, pp. 891–927, 2016.
CR - [33] L. Breiman, "Random Forests", Statistics Department University of California Berkeley, 1-33, 2001.
CR - [34] B. J. Erickson and F. Kitamura, "Magician's Corner: 9. Performance Metrics for Machine Learning Models", Radiology. Artificial intelligence vol. 3, 3e, 2021, doi:10.1148/ryai.2021200126.
UR - https://doi.org/10.29130/dubited.1287453
L1 - https://dergipark.org.tr/en/download/article-file/3102152
ER -