
Comparison of Pattern Matching Techniques on Identification of Same Family Malware

Year 2015, Volume: 4 Issue: 3, 104 - 111, 29.09.2015

Abstract

Advances in computing technology over the past decade have also given rise to threats against users, particularly in the form of malware. Manual malware identification is being overwhelmed by the sheer number of malware samples created every day. Most malware is not created from scratch; a large share consists of variants derived from a particular malware family. This means that the same or a slightly modified countermeasure can be applied against them. This paper analyzes string matching methods for identifying malware of the same family. We investigate and compare the effectiveness of three well-known pattern matching algorithms: Jaro, Longest Common Subsequence (LCS), and N-Gram. We found that thresholds of 0.79 for Jaro, 0.79 for LCS, and 0.54 for N-Gram were effective for detecting string similarity between malware samples.

Index Terms— Jaro, Longest Common Subsequence, Malware Analysis, N-gram, String Similarity
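
To make the comparison in the abstract concrete, the following Python sketch computes the three similarity scores for a pair of strings and checks each against the thresholds reported above. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the LCS and N-Gram scores are normalized, so this sketch assumes LCS length divided by the longer string's length and a Dice coefficient over character bigrams. The function names and the example strings are hypothetical.

```python
from collections import Counter

# Thresholds reported in the abstract.
THRESHOLDS = {"jaro": 0.79, "lcs": 0.79, "ngram": 0.54}


def jaro(a: str, b: str) -> float:
    """Standard Jaro similarity in [0, 1]."""
    if a == b:
        return 1.0
    if not a or not b:
        return 0.0
    window = max(max(len(a), len(b)) // 2 - 1, 0)
    a_hit = [False] * len(a)
    b_hit = [False] * len(b)
    matches = 0
    for i, ch in enumerate(a):
        lo, hi = max(0, i - window), min(len(b), i + window + 1)
        for j in range(lo, hi):
            if not b_hit[j] and b[j] == ch:
                a_hit[i] = b_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that line up in a different order are transpositions.
    a_matched = [a[i] for i in range(len(a)) if a_hit[i]]
    b_matched = [b[j] for j in range(len(b)) if b_hit[j]]
    transpositions = sum(x != y for x, y in zip(a_matched, b_matched)) // 2
    return (matches / len(a) + matches / len(b)
            + (matches - transpositions) / matches) / 3


def lcs_similarity(a: str, b: str) -> float:
    """LCS length over the longer string's length (assumed normalization)."""
    if not a or not b:
        return 1.0 if a == b else 0.0
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, other in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ch == other
                       else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1] / max(len(a), len(b))


def ngram_similarity(a: str, b: str, n: int = 2) -> float:
    """Dice coefficient over character n-gram multisets (assumed variant)."""
    grams_a = Counter(a[i:i + n] for i in range(len(a) - n + 1))
    grams_b = Counter(b[i:i + n] for i in range(len(b) - n + 1))
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        return 1.0 if a == b else 0.0
    shared = sum((grams_a & grams_b).values())
    return 2 * shared / total


def exceeds_thresholds(a: str, b: str) -> dict:
    """Report, per algorithm, whether the score reaches its threshold."""
    scores = {"jaro": jaro(a, b),
              "lcs": lcs_similarity(a, b),
              "ngram": ngram_similarity(a, b)}
    return {name: score >= THRESHOLDS[name] for name, score in scores.items()}


if __name__ == "__main__":
    # Hypothetical strings standing in for text extracted from two samples.
    print(exceeds_thresholds("CreateRemoteThread", "CreateRemoteThreadEx"))
```

The lower N-Gram threshold is consistent with bigram overlap dropping faster than the other two measures when strings diverge, although how the scores are combined into a same-family decision is not stated in the abstract and is left open here.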


Details

Primary Language English
Section Articles
Authors

Ferdiansyah Mastjik

Cihan Varol

Asaf Varol

Publication Date September 29, 2015
Submission Date January 30, 2016
Published in Issue Year 2015 Volume: 4 Issue: 3

Cite

IEEE F. Mastjik, C. Varol, and A. Varol, “Comparison of Pattern Matching Techniques on Identification of Same Family Malware”, IJISS, vol. 4, no. 3, pp. 104–111, 2015.