Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı

Pınar Tüfekci; Çetin Mutlu Önal

doi:10.29130/dubited.1287453

Araştırma Makalesi

Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı

Yıl 2024, Cilt: 12 Sayı: 1, 307 - 319, 26.01.2024

Pınar Tüfekci Çetin Mutlu Önal

https://doi.org/10.29130/dubited.1287453

Öz

Gelişen teknoloji sayesinde bilgiye kolay erişim sağlansa da, bu durum kötü amaçlı eylemlerin artışına da sebep olmuştur. Android işletim sistemlerinde sıklıkla rastlanan kötü amaçlı yazılımlar (malware), kullanıcıların cihazındaki verilere erişerek büyük bir tehdit oluşturmaktadır. Bu çalışma, kötü amaçlı yazılımları tespit etmek amacıyla yüksek doğruluklu ve güvenilir bir model geliştirmeyi hedeflemektedir. Modelleme çalışmalarında popüler bir veri seti olan DREBIN-215 Android Malware Dataset kullanılmıştır. Makine Öğrenmesi algoritmaları arasından Support Vector Machines (SVM), Gradient Boosting (GB), Multi Layer Perceptron (MLP), Naïve Bayes (MNB), K-En Yakın Komşu (KNN) ve Random Forest (RF) algoritmaları uygulanmıştır. Algoritmaların performansları, varsayılan parametreler ve GridSearch yöntemiyle elde edilen en iyi hiperparametre değerlerinin kullanılmasıyla değerlendirilmiştir. En başarılı model, SVM algoritmasıyla en iyi hiperparametrelerin uygulanması sonucu %99.07 doğruluk oranıyla elde edilmiştir.

Anahtar Kelimeler

Kötü amaçlı yazılım, Makine öğrenmesi, Support Vector Machines, Gradient Boosting, Multi Layer Perceptron.

Kaynakça

[1] A. T. Kabakuş, İ. A. Doğru and A. Çetin, "Android kötücül yazılım tespit ve koruma sistemleri", Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi, vol. 31, no. 1, pp. 9-16, Feb. 2015.
[2] M. Grace, Y. Zhou, Q. Zhang, S. Zou and X. Jiang, "RiskRanker: scalable and accurate zero-day android malware detection", MobiSys '12: Proceedings of the 10th international conference on Mobile systems, applications, and services, June 2012, Pages 281–294, https://doi.org/10.1145/2307636.2307663
[3] N. Zhang, Y. Tan, C. Yang and Y. Li, "Deep learning feature exploration for Android malware detection", Applied Soft Computing, Volume 102, April 2021, https://doi.org/10.1016/j.asoc.2020.107069
[4] A. Razgallah, R. Khoury, S. Halle and K. Khanmohammadi, "A survey of malware detection in Android apps: Recommendations and perspectives for future research", Computer Science Review, Volume 39, February 2021, https://doi.org/10.1016/j.cosrev.2020.100358
[5] A. Guerra-Manzanares, M. Luckner and H. Bahsi, "Concept drift and cross-device behavior: Challenges and implications for effective android malware detection", Computers & Security, Volume 120, September 2022, https://doi.org/10.1016/j.cose.2022.102757
[6] F. Ou and J. Xu, "S3Feature: A static sensitive subgraph-based feature for android malware detection", Computers & Security, Volume 112, January 2022, https://doi.org/10.1016/j.cose.2021.102513
[7] A. Martin, R. Lara-Cabrera and D. Camacho, "Android malware detection through hybrid features fusion and ensemble classifiers: The AndroPyTool framework and the OmniDroid dataset", Information Fusion, Volume 52, December 2019, Pages 128-142, https://doi.org/10.1016/j.inffus.2018.12.006
[8] D. Vasan, M. Alazab, S. Wassan, H. Naeem, B. Safaei and Q. Zheng, "IMCFN: Image-based malware classification using fine-tuned convolutional neural network architecture", Computer Networks, Volume 171, 22 April 2020, https://doi.org/10.1016/j.comnet.2020.107138
[9] A. Ananya, P. Vinod and M. Shojafar, "SysDroid: A Dynamic ML-based Android Malware Analyzer using System Call Traces", Cluster Computing, December 2020, DOI:10.1007/s10586-019-03045-6
[10] K. Lin, X. Xu and F. Xiao, "MFFusion: A Multi-level Features Fusion Model for Malicious Traffic Detection based on Deep Learning", Computer Networks, Volume 202, 15 January 2022, https://doi.org/10.1016/j.comnet.2021.108658
[11] W. W. Lo, S. Layeghy, M. Sarhan, M. Gallagher and M. Portmann, "Graph neural network-based android malware classification with jumping knowledge", 2022 IEEE Conference on Dependable and Secure Computing (DSC) , 1–9, 2022.
[12] L. Onwuzurike, E. Mariconti, P. Andriotis, E. D. Cristofaro, G. J. Ross and G. Stringhini, "Mamadroid: Detecting android malware by building markov chains of behavioral models", ACM Trans. Priv. Secur. 22, 14:1–14:3, 2019. [13] Y. Wu, X. Li, D. Zou, W. Yang, X. Zhang and H. Jin, "Malscan: Fast market-wide mobile malware scanning by social-network centrality analysis", in: 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019, San Diego, CA, USA, IEEE. pp. 139–150, 2019.
[14] P. Xu, C. Eckert and A. Zarras, "Detecting and categorizing android malware with graph neural networks", in: SAC ’21: The 36th ACM/SIGAPP Symposium on Applied Computing, pp. 409–412, 2021.
[15] H. Gao, S. Cheng and W. Zhang, "Gdroid: Android malware detection and classification with graph convolutional network", Comput. Secur. 106, 2021.
[16] M.S. Rana, S. S. M. M. Rahman, and A. H. Sung, "Evaluation of tree based machine learning classifiers for android malware detection", Computational Collective Intelligence: 10th International Conference, ICCCI 2018, Bristol, UK, September 5-7, 2018, Proceedings, Part II 10. Springer International Publishing, 2018.
[17] H. Peng, C. Gates, B. Sarma, N. Li, Y. Qi, R. Potharaju, and I. Molloy, I. "Using probabilistic generative models for ranking risks of android apps", In Proceedings of the 2012 ACM conference on Computer and communications security (pp. 241-252), 2012.
[18] D. Arp, M. Spreitzenbarth, M. Hubner, H. Gascon, K. Rieck, and C. E. R. T. Siemens, "Drebin: Effective and explainable detection of android malware in your pocket", In Ndss, Vol. 14, pp. 23-26, 2014.
[19] J. Li, L. Sun, Q. Yan, Z. Li, W. Srisa-an and H. Ye, "Significant Permission Identification for Machine-Learning-Based Android Malware Detection", in IEEE Transactions on Industrial Informatics, vol. 14, no. 7, pp. 3216-3225, July 2018, doi: 10.1109/TII.2017.2789219.
[20] M. Qiao, A. H. Sung and Q. Liu, "Merging Permission and API Features for Android Malware Detection", 2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), Kumamoto, Japan, 2016, pp. 566-571, doi: 10.1109/IIAI-AAI.2016.237.
[21] A. Aydın , İ. A. Doğru and M. Dörterler , "Makine Öğrenmesi Algoritmalarıyla Android Kötücül Yazılım Uygulamalarının Tespiti", Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, vol. 22, no. 2, pp. 1087-1094, Aug. 2018, doi:10.19113/sdufbed.20066
[22] A. Güngör , İ. Dogru , N. Barışçı and S. Toklu , "Görüntü tabanlı özelliklerden ve makine öğrenmesi yöntemlerinden faydalanılarak kötücül yazılım tespiti", Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, vol. 38, no. 3, pp. 1781-1792, Jan. 2023, doi:10.17341/gazimmfd.994289
[23] A. Utku, İ. A. Doğru and M. A. Akcayol, "Decision tree based android malware detection system", 26th Signal Processing and Communications Applications Conference (SIU), Izmir, Turkey, 2018, pp. 1-4, doi: 10.1109/SIU.2018.8404151.
[24] Z. Liu, R. Wang, N. Japkowicz, H. M. Gomes, B. Peng and W. Zhang, "SeGDroid: An Android malware detection method based on sensitive function call graph learning", Expert Systems with Applications, 2023, https://doi.org/10.1016/j.eswa.2023.121125
[25] S. Yang, Y. Wang, H. Xu, F. Xu and M. Chen, "An Android Malware Detection and Classification Approach Based on Contrastive Lerning", Computers & Security Volume 123, 2022, https://doi.org/10.1016/j.cose.2022.102915
[26] J. Sahs and L. Khan, "A Machine Learning Approach to Android Malware Detection," 2012 European Intelligence and Security Informatics Conference, Odense, Denmark, 2012, pp. 141-147, 2012 doi: 10.1109/EISIC.2012.34.
[27] Ö. Kiraz and İ. A. Doğru, "Android Kötücül Yazılım Tespit Sistemleri İncelemesi", Düzce Üniversitesi Bilim ve Teknoloji Dergisi, vol. 5, no. 1, pp. 281-298, Jan. 2017.
[28] S. Haykin, "Neural Networks: A Comprehensive Foundation", Prentice- Hall, Ontario, 837s, 1999.
[29] J. Friedman, "Greedy Function Approximation: A Gradient Boosting Machine", The Annals of Statistics, 29(5), 11-28, 2000.
[30] V. N. Vapnik "The Nature of Static Learing Theory", Springer, 314s, 2000. [31] J. VanderPlas, "Python Data Science Handbook Essential Tools for Working with Data", O'Reilly Media, 2016.
[32] G. O. Campos, A. Zimek, J. Sander, R.J.G.B. Campello, B. Micenková, E. Schubert, I. Assent and M.E. Houle, "On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study", Data Mining and Knowledge Discovery, vol. 30, no. 4, pp. 891–927, 2016. [33] L. Breiman, "Random Forests", Statistics Department University of California Berkeley, 1- 33, 2001.
[34] B. J. Erickson and F. Kitamura, “Magician's Corner: 9. Performance Metrics for Machine Learning Models”, Radiology. Artificial intelligence vol. 3, 3e, 2021, doi:10.1148/ryai.2021200126.

Using Machine Learning Algorithms for Malware Detection

Yıl 2024, Cilt: 12 Sayı: 1, 307 - 319, 26.01.2024

Pınar Tüfekci Çetin Mutlu Önal

https://doi.org/10.29130/dubited.1287453

Öz

Although advanced technology has facilitated easy access to information, it has also led to an increase in malicious activities. Malware, frequently encountered in Android operating systems, poses a significant threat by accessing users' data on their devices. This study aims to develop a highly accurate and reliable model for detecting malware. The modeling work used the popular and imbalanced dataset, DREBIN-215 Android Malware Dataset. Machine Learning algorithms such as Support Vector Machines (SVM), Gradient Boosting (GB), Multi Layer Perceptron (MLP), Naïve Bayes (MNB), K-Nearest Neighbor (KNN) and Random Forest (RF) were applied. The performance of the algorithms was evaluated using default parameters and the best hyperparameter values obtained through the GridSearch method. The most successful model was achieved with an accuracy rate of 99.07% by applying the best hyperparameters in the SVM algorithm.

Anahtar Kelimeler

Malware, Machine learning, Support Vector Machines, Gradient Boosting, Multi Layer Perceptron

Kaynakça

[1] A. T. Kabakuş, İ. A. Doğru and A. Çetin, "Android kötücül yazılım tespit ve koruma sistemleri", Erciyes Üniversitesi Fen Bilimleri Enstitüsü Fen Bilimleri Dergisi, vol. 31, no. 1, pp. 9-16, Feb. 2015.
[2] M. Grace, Y. Zhou, Q. Zhang, S. Zou and X. Jiang, "RiskRanker: scalable and accurate zero-day android malware detection", MobiSys '12: Proceedings of the 10th international conference on Mobile systems, applications, and services, June 2012, Pages 281–294, https://doi.org/10.1145/2307636.2307663
[3] N. Zhang, Y. Tan, C. Yang and Y. Li, "Deep learning feature exploration for Android malware detection", Applied Soft Computing, Volume 102, April 2021, https://doi.org/10.1016/j.asoc.2020.107069
[4] A. Razgallah, R. Khoury, S. Halle and K. Khanmohammadi, "A survey of malware detection in Android apps: Recommendations and perspectives for future research", Computer Science Review, Volume 39, February 2021, https://doi.org/10.1016/j.cosrev.2020.100358
[5] A. Guerra-Manzanares, M. Luckner and H. Bahsi, "Concept drift and cross-device behavior: Challenges and implications for effective android malware detection", Computers & Security, Volume 120, September 2022, https://doi.org/10.1016/j.cose.2022.102757
[6] F. Ou and J. Xu, "S3Feature: A static sensitive subgraph-based feature for android malware detection", Computers & Security, Volume 112, January 2022, https://doi.org/10.1016/j.cose.2021.102513
[7] A. Martin, R. Lara-Cabrera and D. Camacho, "Android malware detection through hybrid features fusion and ensemble classifiers: The AndroPyTool framework and the OmniDroid dataset", Information Fusion, Volume 52, December 2019, Pages 128-142, https://doi.org/10.1016/j.inffus.2018.12.006
[8] D. Vasan, M. Alazab, S. Wassan, H. Naeem, B. Safaei and Q. Zheng, "IMCFN: Image-based malware classification using fine-tuned convolutional neural network architecture", Computer Networks, Volume 171, 22 April 2020, https://doi.org/10.1016/j.comnet.2020.107138
[9] A. Ananya, P. Vinod and M. Shojafar, "SysDroid: A Dynamic ML-based Android Malware Analyzer using System Call Traces", Cluster Computing, December 2020, DOI:10.1007/s10586-019-03045-6
[10] K. Lin, X. Xu and F. Xiao, "MFFusion: A Multi-level Features Fusion Model for Malicious Traffic Detection based on Deep Learning", Computer Networks, Volume 202, 15 January 2022, https://doi.org/10.1016/j.comnet.2021.108658
[11] W. W. Lo, S. Layeghy, M. Sarhan, M. Gallagher and M. Portmann, "Graph neural network-based android malware classification with jumping knowledge", 2022 IEEE Conference on Dependable and Secure Computing (DSC) , 1–9, 2022.
[12] L. Onwuzurike, E. Mariconti, P. Andriotis, E. D. Cristofaro, G. J. Ross and G. Stringhini, "Mamadroid: Detecting android malware by building markov chains of behavioral models", ACM Trans. Priv. Secur. 22, 14:1–14:3, 2019. [13] Y. Wu, X. Li, D. Zou, W. Yang, X. Zhang and H. Jin, "Malscan: Fast market-wide mobile malware scanning by social-network centrality analysis", in: 34th IEEE/ACM International Conference on Automated Software Engineering, ASE 2019, San Diego, CA, USA, IEEE. pp. 139–150, 2019.
[14] P. Xu, C. Eckert and A. Zarras, "Detecting and categorizing android malware with graph neural networks", in: SAC ’21: The 36th ACM/SIGAPP Symposium on Applied Computing, pp. 409–412, 2021.
[15] H. Gao, S. Cheng and W. Zhang, "Gdroid: Android malware detection and classification with graph convolutional network", Comput. Secur. 106, 2021.
[16] M.S. Rana, S. S. M. M. Rahman, and A. H. Sung, "Evaluation of tree based machine learning classifiers for android malware detection", Computational Collective Intelligence: 10th International Conference, ICCCI 2018, Bristol, UK, September 5-7, 2018, Proceedings, Part II 10. Springer International Publishing, 2018.
[17] H. Peng, C. Gates, B. Sarma, N. Li, Y. Qi, R. Potharaju, and I. Molloy, I. "Using probabilistic generative models for ranking risks of android apps", In Proceedings of the 2012 ACM conference on Computer and communications security (pp. 241-252), 2012.
[18] D. Arp, M. Spreitzenbarth, M. Hubner, H. Gascon, K. Rieck, and C. E. R. T. Siemens, "Drebin: Effective and explainable detection of android malware in your pocket", In Ndss, Vol. 14, pp. 23-26, 2014.
[19] J. Li, L. Sun, Q. Yan, Z. Li, W. Srisa-an and H. Ye, "Significant Permission Identification for Machine-Learning-Based Android Malware Detection", in IEEE Transactions on Industrial Informatics, vol. 14, no. 7, pp. 3216-3225, July 2018, doi: 10.1109/TII.2017.2789219.
[20] M. Qiao, A. H. Sung and Q. Liu, "Merging Permission and API Features for Android Malware Detection", 2016 5th IIAI International Congress on Advanced Applied Informatics (IIAI-AAI), Kumamoto, Japan, 2016, pp. 566-571, doi: 10.1109/IIAI-AAI.2016.237.
[21] A. Aydın , İ. A. Doğru and M. Dörterler , "Makine Öğrenmesi Algoritmalarıyla Android Kötücül Yazılım Uygulamalarının Tespiti", Süleyman Demirel Üniversitesi Fen Bilimleri Enstitüsü Dergisi, vol. 22, no. 2, pp. 1087-1094, Aug. 2018, doi:10.19113/sdufbed.20066
[22] A. Güngör , İ. Dogru , N. Barışçı and S. Toklu , "Görüntü tabanlı özelliklerden ve makine öğrenmesi yöntemlerinden faydalanılarak kötücül yazılım tespiti", Gazi Üniversitesi Mühendislik Mimarlık Fakültesi Dergisi, vol. 38, no. 3, pp. 1781-1792, Jan. 2023, doi:10.17341/gazimmfd.994289
[23] A. Utku, İ. A. Doğru and M. A. Akcayol, "Decision tree based android malware detection system", 26th Signal Processing and Communications Applications Conference (SIU), Izmir, Turkey, 2018, pp. 1-4, doi: 10.1109/SIU.2018.8404151.
[24] Z. Liu, R. Wang, N. Japkowicz, H. M. Gomes, B. Peng and W. Zhang, "SeGDroid: An Android malware detection method based on sensitive function call graph learning", Expert Systems with Applications, 2023, https://doi.org/10.1016/j.eswa.2023.121125
[25] S. Yang, Y. Wang, H. Xu, F. Xu and M. Chen, "An Android Malware Detection and Classification Approach Based on Contrastive Lerning", Computers & Security Volume 123, 2022, https://doi.org/10.1016/j.cose.2022.102915
[26] J. Sahs and L. Khan, "A Machine Learning Approach to Android Malware Detection," 2012 European Intelligence and Security Informatics Conference, Odense, Denmark, 2012, pp. 141-147, 2012 doi: 10.1109/EISIC.2012.34.
[27] Ö. Kiraz and İ. A. Doğru, "Android Kötücül Yazılım Tespit Sistemleri İncelemesi", Düzce Üniversitesi Bilim ve Teknoloji Dergisi, vol. 5, no. 1, pp. 281-298, Jan. 2017.
[28] S. Haykin, "Neural Networks: A Comprehensive Foundation", Prentice- Hall, Ontario, 837s, 1999.
[29] J. Friedman, "Greedy Function Approximation: A Gradient Boosting Machine", The Annals of Statistics, 29(5), 11-28, 2000.
[30] V. N. Vapnik "The Nature of Static Learing Theory", Springer, 314s, 2000. [31] J. VanderPlas, "Python Data Science Handbook Essential Tools for Working with Data", O'Reilly Media, 2016.
[32] G. O. Campos, A. Zimek, J. Sander, R.J.G.B. Campello, B. Micenková, E. Schubert, I. Assent and M.E. Houle, "On the evaluation of unsupervised outlier detection: measures, datasets, and an empirical study", Data Mining and Knowledge Discovery, vol. 30, no. 4, pp. 891–927, 2016. [33] L. Breiman, "Random Forests", Statistics Department University of California Berkeley, 1- 33, 2001.
[34] B. J. Erickson and F. Kitamura, “Magician's Corner: 9. Performance Metrics for Machine Learning Models”, Radiology. Artificial intelligence vol. 3, 3e, 2021, doi:10.1148/ryai.2021200126.

Toplam 31 adet kaynakça vardır.

Ayrıntılar

Birincil Dil	Türkçe
Konular	Mühendislik
Bölüm	Makaleler
Yazarlar	Pınar Tüfekci 0000-0003-4842-2635 Çetin Mutlu Önal 0009-0009-9136-2608
Yayımlanma Tarihi	26 Ocak 2024
Yayımlandığı Sayı	Yıl 2024 Cilt: 12 Sayı: 1

Kaynak Göster

APA	Tüfekci, P., & Önal, Ç. M. (2024). Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı. Düzce Üniversitesi Bilim Ve Teknoloji Dergisi, 12(1), 307-319. https://doi.org/10.29130/dubited.1287453
AMA	Tüfekci P, Önal ÇM. Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı. DÜBİTED. Ocak 2024;12(1):307-319. doi:10.29130/dubited.1287453
Chicago	Tüfekci, Pınar, ve Çetin Mutlu Önal. “Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı”. Düzce Üniversitesi Bilim Ve Teknoloji Dergisi 12, sy. 1 (Ocak 2024): 307-19. https://doi.org/10.29130/dubited.1287453.
EndNote	Tüfekci P, Önal ÇM (01 Ocak 2024) Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı. Düzce Üniversitesi Bilim ve Teknoloji Dergisi 12 1 307–319.
IEEE	P. Tüfekci ve Ç. M. Önal, “Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı”, DÜBİTED, c. 12, sy. 1, ss. 307–319, 2024, doi: 10.29130/dubited.1287453.
ISNAD	Tüfekci, Pınar - Önal, Çetin Mutlu. “Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı”. Düzce Üniversitesi Bilim ve Teknoloji Dergisi 12/1 (Ocak 2024), 307-319. https://doi.org/10.29130/dubited.1287453.
JAMA	Tüfekci P, Önal ÇM. Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı. DÜBİTED. 2024;12:307–319.
MLA	Tüfekci, Pınar ve Çetin Mutlu Önal. “Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı”. Düzce Üniversitesi Bilim Ve Teknoloji Dergisi, c. 12, sy. 1, 2024, ss. 307-19, doi:10.29130/dubited.1287453.
Vancouver	Tüfekci P, Önal ÇM. Kötü Amaçlı Yazılım Tespiti için Makine Öğrenmesi Algoritmalarının Kullanımı. DÜBİTED. 2024;12(1):307-19.

Kapak Resmi İndir

Makale Dosyaları

Tam Metin