Kötü Amaçlı Windows Çalıştırılabilir Dosyalarının Derin Öğrenme İle Tespiti

Mahmut Tokmak; Ecir Uğur Küçüksille

doi:10.30516/bilgesci.531801

Year 2019, Volume: 3 Issue: 1, 67 - 76, 30.03.2019

Cited By: 3

Abstract

In today's internet age, malware is a serious and growing threat to
information security. Detecting malware is therefore extremely important for
preventing the damage it can cause. In this study, malware detection was
attempted by analyzing Windows application programming interface (API) calls
and the fields contained in the optional header of Windows executable files.
A data set of malicious and benign executable files was created for the study.
The data set contains 875 Windows executable files: 592 benign and 283
malicious. Each executable file in the data set was represented as a vector
based on its Windows API calls and optional header fields. Dimension reduction
was performed on the feature vector using principal component analysis. The
reduced features were used to train and test a Deep Learning model for malware
detection. At the end of the study, the Deep Learning model reached an
accuracy of 100%.
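
As a rough illustration of how an executable could be turned into such a vector, the sketch below reads the leading optional-header fields from raw PE bytes and appends them to binary flags for imported API names. The API vocabulary, the presence/absence encoding, the choice of header fields, and all function names are assumptions made for this example; the abstract does not specify them. The demo input is a synthetic header, not a real executable.

```python
import struct

# Hypothetical API vocabulary; the study's actual API list is not given in the abstract.
API_VOCAB = ["CreateFileA", "WriteFile", "VirtualAlloc",
             "CreateRemoteThread", "RegSetValueExA"]

# Leading optional-header fields that PE32 and PE32+ images share.
OPT_FIELDS = ("Magic", "MajorLinkerVersion", "MinorLinkerVersion", "SizeOfCode",
              "SizeOfInitializedData", "SizeOfUninitializedData",
              "AddressOfEntryPoint", "BaseOfCode")
OPT_FORMAT = "<HBBIIIII"  # little-endian, 24 bytes in total


def parse_optional_header(data: bytes) -> dict:
    """Read the first standard optional-header fields from raw PE bytes."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)  # e_lfanew
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    opt_offset = pe_offset + 4 + 20  # skip the signature and the 20-byte COFF header
    values = struct.unpack_from(OPT_FORMAT, data, opt_offset)
    return dict(zip(OPT_FIELDS, values))


def build_feature_vector(imported_apis, header: dict) -> list:
    """Concatenate API-presence flags (assumed encoding) with header values."""
    imported = set(imported_apis)
    api_bits = [1.0 if name in imported else 0.0 for name in API_VOCAB]
    header_values = [float(header[name]) for name in OPT_FIELDS]
    return api_bits + header_values


def make_demo_pe() -> bytes:
    """Build a tiny synthetic PE header (not a loadable executable) for the demo."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature follows the DOS header
    coff = struct.pack("<HHIIIHH", 0x14C, 1, 0, 0, 0, 0xE0, 0x102)
    opt = struct.pack(OPT_FORMAT, 0x10B, 14, 0, 4096, 2048, 0, 0x1000, 0x1000)
    return bytes(dos) + b"PE\x00\x00" + coff + opt


if __name__ == "__main__":
    header = parse_optional_header(make_demo_pe())
    print(build_feature_vector(["WriteFile", "VirtualAlloc"], header))
```

In practice the raw header values span very different ranges, so some scaling would normally precede dimension reduction; that step is omitted here.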

Keywords

Malware detection, Information and computer security, Deep learning

Detection of Windows Executable Malware Files with Deep Learning

Abstract

In today's internet age, malware is a serious and growing threat to
information security. Detecting malware is therefore extremely important for
preventing the harm it may cause. This study attempts to detect malware by
analyzing Windows Application Programming Interface (API) calls and the fields
of the optional header of Windows executable files. A data set consisting of
malware and benign executable files was created. It contains 875 portable
executable files, 592 of them benign and 283 of them malware. Each portable
executable file in the data set is expressed as a vector built from its
Windows API calls and optional header fields. Dimension reduction was applied
to the feature vector by principal component analysis. The reduced features
were used to train and test a Deep Learning model, which detected the malware.
At the end of the study, 100% accuracy was achieved with Deep Learning.
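
The dimension-reduction step is named in the original-language abstract as principal component analysis. The sketch below shows the idea on a toy matrix using only the standard library: center the data, form the covariance matrix, and approximate its leading eigenvector by power iteration. Power iteration is a stand-in chosen for brevity, not the paper's implementation; the function names and toy data are hypothetical, and only the first component is computed.

```python
import math


def center(rows):
    """Subtract each column's mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iterations=200):
    """Approximate the leading eigenvector of cov by power iteration.

    Starts from a uniform unit vector, so it would fail in the special case
    where that start is orthogonal to the leading eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(rows, component):
    """Project the centered rows onto one component (one value per sample)."""
    return [sum(a * b for a, b in zip(r, component)) for r in center(rows)]


if __name__ == "__main__":
    toy = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]  # toy data, not from the study
    pc1 = top_component(covariance(center(toy)))
    print(pc1, project(toy, pc1))
```

Reducing to more than one dimension would require further components (for example by deflating the covariance matrix), which a numerical library would normally handle.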

Keywords

Malware detection, Information and computer security, Deep learning

There are 37 citations in total.

Details

Primary Language	Turkish
Subjects	Engineering
Journal Section	Research Articles
Authors	Mahmut Tokmak (ORCID 0000-0003-0632-4308), Ecir Uğur Küçüksille (ORCID 0000-0002-3293-9878)
Publication Date	March 30, 2019
Acceptance Date	March 25, 2019
Published in Issue	Year 2019 Volume: 3 Issue: 1

Cite

APA	Tokmak, M., & Küçüksille, E. U. (2019). Kötü Amaçlı Windows Çalıştırılabilir Dosyalarının Derin Öğrenme İle Tespiti. Bilge International Journal of Science and Technology Research, 3(1), 67-76. https://doi.org/10.30516/bilgesci.531801

Cited By

An Empirical Comparison of Machine Learning Algorithms for Predicting Breast Cancer

Bilge International Journal of Science and Technology Research

https://doi.org/10.30516/bilgesci.645067

DERİN SİNİR AĞLARI VE YENİDEN ÖRNEKLEME METOTLARI İLE RUTİN KAN TESTLERİNE DAYALI COVID-19 TESPİTİ

Konya Journal of Engineering Sciences

https://doi.org/10.36306/konjes.877805

A Hybrid Method Based On A Genetic Algorithm That Uses Network Packets To Classify Spyware

Journal of Physical Chemistry and Functional Materials

https://doi.org/10.54565/jphcfum.1579687

This work is licensed under a Creative Commons Attribution 4.0 International License