The rapid evolution of malware presents significant challenges in cybersecurity. This study investigates the efficacy of various machine learning and ensemble learning models for malware detection using dynamic analysis. The dynamic datasets contain API calls and permissions, enabling real-time monitoring of malware behavior. For both the VirusSample and VirusShare datasets, the random forest (RF) model achieved the best results among the individual machine learning models, with accuracies of 94.69% and 85.72%, respectively. For the VirusSample dataset, the stacking ensemble learning model that uses RF and decision trees (DT) as base classifiers and K-nearest neighbors (KNN) as the meta-classifier achieved the highest ensemble accuracy, 94.52%. For the VirusShare dataset, the stacking ensemble learning model that uses RF, KNN, and gradient boosting (GB) as base classifiers and a support vector machine (SVM) as the meta-classifier achieved the highest ensemble accuracy, 85.7%. These results underscore the superiority of dynamic analysis and the effectiveness of ensemble methods in enhancing malware detection accuracy. This study contributes to the optimization of machine learning models and the advancement of cybersecurity solutions.
Keywords: information security, malware detection, dynamic analysis, machine learning, ensemble learning
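
The stacking configuration reported for the VirusSample dataset (RF and DT as base classifiers, KNN as the meta-classifier) can be illustrated with a minimal sketch. It assumes scikit-learn is available, which the abstract does not state, and it uses synthetic features from `make_classification` as a stand-in for API-call and permission vectors. The function names, hyperparameters, and data below are hypothetical; the sketch does not reproduce the study's pipeline or its reported accuracies.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def build_stacking_model(seed: int = 0) -> StackingClassifier:
    """Stack RF and DT base classifiers under a KNN meta-classifier."""
    base_classifiers = [
        # Hyperparameters are illustrative assumptions, not the study's settings.
        ("rf", RandomForestClassifier(n_estimators=50, random_state=seed)),
        ("dt", DecisionTreeClassifier(random_state=seed)),
    ]
    # cv=5: the meta-classifier is trained on out-of-fold base predictions.
    return StackingClassifier(
        estimators=base_classifiers,
        final_estimator=KNeighborsClassifier(n_neighbors=5),
        cv=5,
    )


def evaluate_on_synthetic_data(seed: int = 0) -> float:
    """Fit the stack on synthetic data and return held-out accuracy."""
    # Synthetic stand-in for dynamic-analysis features (assumption).
    X, y = make_classification(
        n_samples=400, n_features=20, n_informative=5, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = build_stacking_model(seed)
    model.fit(X_train, y_train)
    return float(accuracy_score(y_test, model.predict(X_test)))


if __name__ == "__main__":
    print(f"held-out accuracy on synthetic data: {evaluate_on_synthetic_data():.3f}")
```

Swapping the `estimators` list and `final_estimator` would give the other configuration mentioned in the abstract (RF, KNN, and GB as base classifiers with an SVM meta-classifier), again only as an illustration under the same assumptions.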
Primary Language | English |
---|---|
Subjects | System and Network Security, Data Security and Protection |
Journal Section | Research Article |
Authors | |
Publication Date | December 29, 2024 |
Submission Date | July 4, 2024 |
Acceptance Date | October 3, 2024 |
Published in Issue | Year 2024 Volume: 13 Issue: 4 |