Comparison of Pattern Matching Techniques on Identification of Same Family Malware

Volume: 4 Number: 3 September 29, 2015

Abstract

Advances in computing technology over the past decade have also given rise to threats against users, particularly in the form of malware. Manual malware identification, however, is being overwhelmed by the sheer number of malware samples created every day. Most malware is not created from scratch; much of it derives from a particular malware family, which means that the same or a slightly modified remedy can be applied to counter it. This paper analyzes string matching methods for identifying same-family malware. We investigate and compare the effectiveness of three well-known pattern matching algorithms: Jaro, Longest Common Subsequence (LCS), and N-Gram. We found that thresholds of 0.79 for Jaro, 0.79 for LCS, and 0.54 for N-Gram were effective for detecting string similarity between malware samples.

Index Terms— Jaro, Longest Common Subsequence, Malware Analysis, N-gram, String Similarity
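
As a rough illustration of the three measures named in the abstract, the following Python sketch computes a Jaro score, a normalized LCS score, and an N-gram score for two strings, then compares each against the reported thresholds. The abstract does not specify how the scores are normalized, so several choices here are assumptions rather than the authors' method: LCS length is divided by the longer string's length, the N-gram score is a Dice coefficient over character bigrams, and a score equal to the threshold counts as a match. The function names and the `exceeds_thresholds` helper are hypothetical.

```python
from collections import Counter

# Thresholds reported in the abstract.
THRESHOLDS = {"jaro": 0.79, "lcs": 0.79, "ngram": 0.54}


def jaro(s: str, t: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s == t:
        return 1.0
    if not s or not t:
        return 0.0
    # Characters match only if they are equal and within this distance.
    window = max(max(len(s), len(t)) // 2 - 1, 0)
    s_hit = [False] * len(s)
    t_hit = [False] * len(t)
    matches = 0
    for i, ch in enumerate(s):
        lo, hi = max(0, i - window), min(len(t), i + window + 1)
        for j in range(lo, hi):
            if not t_hit[j] and t[j] == ch:
                s_hit[i] = t_hit[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    s_matched = [c for c, hit in zip(s, s_hit) if hit]
    t_matched = [c for c, hit in zip(t, t_hit) if hit]
    # Half the number of matched characters that appear in a different order.
    transpositions = sum(a != b for a, b in zip(s_matched, t_matched)) // 2
    return (matches / len(s) + matches / len(t)
            + (matches - transpositions) / matches) / 3


def lcs_similarity(s: str, t: str) -> float:
    """LCS length divided by the longer length (assumed normalization)."""
    if not s or not t:
        return 1.0 if s == t else 0.0
    prev = [0] * (len(t) + 1)
    for a in s:
        curr = [0]
        for j, b in enumerate(t, 1):
            curr.append(prev[j - 1] + 1 if a == b else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1] / max(len(s), len(t))


def ngram_similarity(s: str, t: str, n: int = 2) -> float:
    """Dice coefficient over character n-grams (assumed variant, n=2)."""
    grams_s = Counter(s[i:i + n] for i in range(len(s) - n + 1))
    grams_t = Counter(t[i:i + n] for i in range(len(t) - n + 1))
    total = sum(grams_s.values()) + sum(grams_t.values())
    if total == 0:
        return 1.0 if s == t else 0.0
    shared = sum((grams_s & grams_t).values())
    return 2 * shared / total


def exceeds_thresholds(s: str, t: str) -> dict:
    """Report, per measure, whether the score reaches its threshold."""
    scores = {
        "jaro": jaro(s, t),
        "lcs": lcs_similarity(s, t),
        "ngram": ngram_similarity(s, t),
    }
    # Inclusive comparison is an assumption; the abstract does not say.
    return {name: scores[name] >= THRESHOLDS[name] for name in scores}
```

In practice the inputs would be strings extracted from malware samples; which strings the authors compare, and how per-string results are aggregated per sample, is not stated in the abstract and is left out of this sketch.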

Details

Primary Language

English

Authors

Ferdiansyah Mastjik

Cihan Varol

Publication Date

September 29, 2015

Submission Date

January 30, 2016

Published in Issue

Year 2015 Volume: 4 Number: 3

APA
Mastjik, F., Varol, C., & Varol, A. (2015). Comparison of Pattern Matching Techniques on Identification of Same Family Malware. International Journal of Information Security Science, 4(3), 104-111. https://izlik.org/JA53JD78GN
AMA
1. Mastjik F, Varol C, Varol A. Comparison of Pattern Matching Techniques on Identification of Same Family Malware. IJISS. 2015;4(3):104-111. https://izlik.org/JA53JD78GN
Chicago
Mastjik, Ferdiansyah, Cihan Varol, and Asaf Varol. 2015. “Comparison of Pattern Matching Techniques on Identification of Same Family Malware”. International Journal of Information Security Science 4 (3): 104-11. https://izlik.org/JA53JD78GN.
EndNote
Mastjik F, Varol C, Varol A (September 1, 2015) Comparison of Pattern Matching Techniques on Identification of Same Family Malware. International Journal of Information Security Science 4 3 104–111.
IEEE
[1] F. Mastjik, C. Varol, and A. Varol, “Comparison of Pattern Matching Techniques on Identification of Same Family Malware”, IJISS, vol. 4, no. 3, pp. 104–111, Sept. 2015, [Online]. Available: https://izlik.org/JA53JD78GN
ISNAD
Mastjik, Ferdiansyah - Varol, Cihan - Varol, Asaf. “Comparison of Pattern Matching Techniques on Identification of Same Family Malware”. International Journal of Information Security Science 4/3 (September 1, 2015): 104-111. https://izlik.org/JA53JD78GN.
JAMA
1. Mastjik F, Varol C, Varol A. Comparison of Pattern Matching Techniques on Identification of Same Family Malware. IJISS. 2015;4:104–111.
MLA
Mastjik, Ferdiansyah, et al. “Comparison of Pattern Matching Techniques on Identification of Same Family Malware”. International Journal of Information Security Science, vol. 4, no. 3, Sept. 2015, pp. 104-11, https://izlik.org/JA53JD78GN.
Vancouver
1. Ferdiansyah Mastjik, Cihan Varol, Asaf Varol. Comparison of Pattern Matching Techniques on Identification of Same Family Malware. IJISS [Internet]. 2015 Sep. 1;4(3):104-11. Available from: https://izlik.org/JA53JD78GN