A Defensive Distillation Approach Towards Adversarial Attacks Based on Gaussian Blurring
Abstract
Deep Neural Networks (DNNs) have attained remarkable prediction outcomes in image classification tasks, leading to significant progress in computer vision applications. However, adversarial examples have emerged as a critical challenge to the robustness and reliability of deep learning-based image classifiers. Adversarial examples are input images carrying specially designed perturbations that deceive models into generating inaccurate predictions while remaining indistinguishable from the original images to human observers. In this paper, we present a defense mechanism, namely Defensive Distillation with Gaussian Blurring (DDGB), that improves the robustness of deep learning models against adversarial attacks. First, two models, a teacher and a student, are used to train and validate the presented approach. The teacher model is trained and then used to produce softened probabilities, which in turn are used to train the student model. Second, a feature-squeezing technique based on Gaussian blurring is applied to the adversarial examples generated from the distilled student model, as a defense that makes the adversarial perturbations less effective. The findings demonstrate that the proposed approach is effective, achieving classification accuracies of 87.61% and 87.48% under the Fast Gradient Sign Method (FGSM) and Basic Iterative Method (BIM) attacks, respectively, on the CIFAR-10 dataset. In summary, the presented approach achieves a 70.66% reduction in computations for the student model, allowing the model to be deployed on devices with limited resources while providing improved prediction accuracy under adversarial attacks.
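The building blocks named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names and every parameter value (temperature, sigma, kernel radius, epsilon) are assumptions chosen for illustration, since the abstract does not state them. The sketch shows temperature-softened probabilities of the kind a teacher passes to a student, a separable Gaussian blur used as feature squeezing, and a single FGSM step given a precomputed loss gradient.

```python
import numpy as np

def softened_probabilities(logits, temperature=20.0):
    """Softmax at temperature T; a larger T gives a softer distribution.
    The default T is an assumed value, not taken from the paper."""
    z = np.asarray(logits, dtype=np.float64) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def gaussian_kernel_1d(sigma=1.0, radius=2):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(image, sigma=1.0, radius=2):
    """Separable Gaussian blur of an H x W x C image with values in [0, 1].
    sigma and radius are assumed values, not taken from the paper."""
    h, w = image.shape[:2]
    k = gaussian_kernel_1d(sigma, radius)
    padded = np.pad(image, ((radius, radius), (radius, radius), (0, 0)), mode="reflect")
    # Blur along the height axis first, then along the width axis.
    tmp = sum(k[i] * padded[i:i + h, :, :] for i in range(2 * radius + 1))
    return sum(k[j] * tmp[:, j:j + w, :] for j in range(2 * radius + 1))

def fgsm_perturb(image, grad, epsilon=8 / 255):
    """One FGSM step: move each pixel by epsilon along the sign of the loss
    gradient, then clip back to the valid range. grad is assumed to be the
    gradient of the classification loss with respect to the image."""
    return np.clip(image + epsilon * np.sign(grad), 0.0, 1.0)
```

In a pipeline of this shape, the blur would be applied to an input before it reaches the student classifier, for example `gaussian_blur(fgsm_perturb(x, grad))`, so that part of the high-frequency perturbation is smoothed away before prediction.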
Keywords
Details
Primary Language
English
Subjects
Software Engineering (Other)
Journal Section
Research Article
Authors
Early Pub Date
May 11, 2026
Publication Date
June 17, 2026
Submission Date
June 18, 2025
Acceptance Date
January 1, 2026
Published in Issue
Year 2026 Volume: 9 Number: 2
