Year 2019, Volume 3, Issue 1, Pages 67-76, 2019-03-30

Detection of Windows Executable Malware Files with Deep Learning
Kötü Amaçlı Windows Çalıştırılabilir Dosyalarının Derin Öğrenme İle Tespiti

Mahmut TOKMAK [1], Ecir Uğur KÜÇÜKSİLLE [2]


In today's internet age, malware is a serious and growing threat to information security, so detecting it is essential to preventing the harm it can cause. This study detects malware by analyzing Windows Application Programming Interface (API) calls and the fields of the optional header of Windows executable files. A data set of 875 portable executable (PE) files was created, of which 283 are benign and 592 are malware. Each file in the data set is represented as a feature vector built from its Windows API calls and optional header fields. The dimensionality of the feature vector was reduced using principal component analysis. A Deep Learning model was then trained and tested on the reduced features to detect malware. The study reports 100% accuracy with Deep Learning.
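The feature-vector step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the API vocabulary, the choice of header fields, and the function names `api_vector`, `optional_header_fields`, and `feature_vector` are all assumptions, and the list of imported API names is assumed to come from a separate PE parsing step that is not shown.

```python
import struct

# Hypothetical API vocabulary for illustration; the study's actual list is not given here.
API_VOCABULARY = ["CreateFileA", "WriteFile", "VirtualAlloc", "CreateRemoteThread"]


def api_vector(imported_apis, vocabulary=API_VOCABULARY):
    """Return a 0/1 vector marking which vocabulary APIs the file imports."""
    present = set(imported_apis)
    return [1 if name in present else 0 for name in vocabulary]


def optional_header_fields(pe_bytes):
    """Read a few numeric fields from the start of the PE optional header."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and the 20-byte COFF header.
    offset = e_lfanew + 4 + 20
    (magic, major_linker, minor_linker, size_of_code, size_of_init_data,
     size_of_uninit_data, entry_point) = struct.unpack_from("<HBBIIII", pe_bytes, offset)
    # Which fields to keep is an assumption; the abstract does not list them.
    return [major_linker, minor_linker, size_of_code, size_of_init_data,
            size_of_uninit_data, entry_point]


def feature_vector(pe_bytes, imported_apis):
    """Concatenate the API indicator vector with the header fields."""
    return api_vector(imported_apis) + optional_header_fields(pe_bytes)
```

In this sketch, each file yields one fixed-length numeric row, so a data set of files becomes a matrix that a dimension-reduction step and a classifier can consume.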

 

In today's internet age, malware is a serious and evolving threat to information security. Detecting malware is therefore extremely important for preventing the harm it may cause. In this study, malware detection was attempted by analyzing Windows application programming interface (API) calls and the fields contained in the optional header of Windows executable files. A data set consisting of malicious and benign executable files was created. The data set comprises 875 Windows executable files: 592 benign and 283 malicious. Each executable file in the data set was expressed as a vector based on its Windows API calls and optional header fields. Dimension reduction was performed by applying principal component analysis to the feature vector. The reduced features were trained and tested with Deep Learning to detect malware. At the end of the study, an accuracy of 100% was reached with Deep Learning.
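The abstract names principal component analysis as the dimension-reduction step. A minimal pure-Python sketch of that idea is shown below, assuming the feature vectors are rows of equal length and at least two rows are available. It uses power iteration with deflation on the covariance matrix, which is one simple way to obtain the leading components and not necessarily what the authors used; the function names are illustrative.

```python
import math


def center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iterations=200):
    """Dominant eigenvector and eigenvalue of cov via power iteration."""
    d = len(cov)
    # Assumption: the all-ones start vector is not orthogonal to the answer.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # no variance left to explain
            return None, 0.0
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
    return v, eigenvalue


def pca_reduce(rows, k):
    """Project rows onto up to k leading principal components."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(k):
        v, lam = top_component(cov)
        if v is None:
            break
        components.append(v)
        # Deflation: remove the found component so the next one can emerge.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components]
            for row in centered]
```

The reduced rows returned by `pca_reduce` would then serve as the inputs to the classifier; in practice a numerical library would be a more robust choice than this hand-rolled iteration.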

Primary Language tr
Subjects Engineering
Journal Section Research Articles
Authors

Orcid: 0000-0003-0632-4308
Author: Mahmut TOKMAK (Primary Author)
Institution: ISPARTA UYGULAMALI BİLİMLER ÜNİVERSİTESİ GELENDOST MESLEK YÜKSEKOKULU
Country: Turkey


Orcid: 0000-0002-3293-9878
Author: Ecir Uğur KÜÇÜKSİLLE
Institution: SÜLEYMAN DEMİREL ÜNİVERSİTESİ MÜHENDİSLİK FAKÜLTESİ BİLGİSAYAR MÜHENDİSLİĞİ

Dates

Publication Date : March 30, 2019

APA: Tokmak, M., Küçüksille, E. (2019). Kötü Amaçlı Windows Çalıştırılabilir Dosyalarının Derin Öğrenme İle Tespiti. Bilge International Journal of Science and Technology Research, 3(1), 67-76. DOI: 10.30516/bilgesci.531801