A Defensive Distillation Approach Towards Adversarial Attacks Based on Gaussian Blurring

Ahmed Ahmed; Mohammed Al-nuaimi; Faris Alghareb; Ahmad Abdulfattah

doi:10.35377/saucis...1721392

Abstract

Deep Neural Networks (DNNs) have achieved remarkable prediction performance in image classification tasks, leading to significant progress in computer vision applications. However, adversarial examples have emerged as a critical challenge to the robustness and efficiency of deep learning-based image classifiers. Adversarial examples are input images carrying specially designed perturbations that deceive a model into making incorrect predictions while remaining indistinguishable from the originals to human observers. In this paper, we present a defense mechanism, Defensive Distillation with Gaussian Blurring (DDGB), that improves the robustness of deep learning models against adversarial attacks. First, two models, a teacher and a student, are used to train and validate the approach: the teacher model is trained and then used to produce softened probabilities, which are in turn used to train the student model. Second, a feature-squeezing technique based on Gaussian blurring is applied to the adversarial examples generated from the distilled student model, making the adversarial perturbations less effective. The results show that the proposed approach achieves classification accuracies of 87.61% and 87.48% under the Fast Gradient Sign Method (FGSM) and Basic Iterative Method (BIM) attacks, respectively, on the CIFAR-10 dataset. Overall, the approach achieves a 70.66% reduction in computations for the student model, allowing the model to be deployed on resource-constrained devices while providing improved prediction accuracy under adversarial attacks.
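The two building blocks named in the abstract, softened probabilities and Gaussian blurring, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the temperature value, the blur `sigma`, the kernel `radius`, and the edge-replication padding are all assumptions chosen for illustration, since the abstract does not state them.

```python
import math


def softened_probabilities(logits, temperature):
    """Softmax with temperature; a temperature above 1 flattens the output."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_kernel(sigma, radius):
    """Normalized 1-D Gaussian kernel of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, replicating edge pixels."""
    radius = len(kernel) // 2
    blurred = []
    for row in image:
        last = len(row) - 1
        new_row = []
        for x in range(len(row)):
            acc = 0.0
            for k, weight in enumerate(kernel):
                idx = min(max(x + k - radius, 0), last)  # clamp to the image
                acc += weight * row[idx]
            new_row.append(acc)
        blurred.append(new_row)
    return blurred


def transpose(image):
    return [list(column) for column in zip(*image)]


def gaussian_blur(image, sigma=1.0, radius=1):
    """Separable Gaussian blur of a 2-D grayscale image (list of rows)."""
    kernel = gaussian_kernel(sigma, radius)
    horizontal = blur_rows(image, kernel)
    # Blurring the transposed image filters the columns of the original.
    return transpose(blur_rows(transpose(horizontal), kernel))


# Hypothetical teacher logits for three classes (illustrative values only).
teacher_logits = [2.0, 1.0, 0.1]
hard_targets = softened_probabilities(teacher_logits, temperature=1.0)
soft_targets = softened_probabilities(teacher_logits, temperature=20.0)

# A single bright pixel stands in for a high-frequency perturbation.
perturbed = [[0.0] * 5 for _ in range(5)]
perturbed[2][2] = 1.0
squeezed = gaussian_blur(perturbed, sigma=1.0, radius=1)
```

In this sketch, `soft_targets` is a flatter distribution than `hard_targets`, which is the kind of training signal the student model receives in distillation, and `squeezed` spreads the isolated spike across its neighbours, which is the sense in which blurring can weaken a pixel-level perturbation. A real pipeline would apply the blur per colour channel to the adversarial images before classification.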

Details

Primary Language

English

Subjects

Software Engineering (Other)

Journal Section

Research Article

Authors

Ahmed Ahmed ^*
0000-0002-9852-1570
Iraq

Mohammed Al-nuaimi
0000-0001-7819-4974
Iraq

Faris Alghareb
0000-0001-6564-4008
Iraq

Ahmad Abdulfattah
0000-0002-9285-3400
Iraq

Early Pub Date

May 11, 2026

Publication Date

June 17, 2026

Submission Date

June 18, 2025

Acceptance Date

January 1, 2026

Published in Issue

Year 2026 Volume: 9 Number: 2

DOI

https://doi.org/10.35377/saucis...1721392

IZ

https://izlik.org/JA34HU29PS

Cite

APA

Ahmed, A., Al-nuaimi, M., Alghareb, F., & Abdulfattah, A. (2026). A Defensive Distillation Approach Towards Adversarial Attacks Based on Gaussian Blurring. Sakarya University Journal of Computer and Information Sciences, 9(2), 336-348. https://doi.org/10.35377/saucis...1721392

AMA

1. Ahmed A, Al-nuaimi M, Alghareb F, Abdulfattah A. A Defensive Distillation Approach Towards Adversarial Attacks Based on Gaussian Blurring. SAUCIS. 2026;9(2):336-348. doi:10.35377/saucis.1721392

Chicago

Ahmed, Ahmed, Mohammed Al-nuaimi, Faris Alghareb, and Ahmad Abdulfattah. 2026. “A Defensive Distillation Approach Towards Adversarial Attacks Based on Gaussian Blurring”. Sakarya University Journal of Computer and Information Sciences 9 (2): 336-48. https://doi.org/10.35377/saucis. 1721392.

EndNote

Ahmed A, Al-nuaimi M, Alghareb F, Abdulfattah A (June 1, 2026) A Defensive Distillation Approach Towards Adversarial Attacks Based on Gaussian Blurring. Sakarya University Journal of Computer and Information Sciences 9 2 336–348.

IEEE

[1] A. Ahmed, M. Al-nuaimi, F. Alghareb, and A. Abdulfattah, “A Defensive Distillation Approach Towards Adversarial Attacks Based on Gaussian Blurring”, SAUCIS, vol. 9, no. 2, pp. 336–348, June 2026, doi: 10.35377/saucis...1721392.

ISNAD

Ahmed, Ahmed - Al-nuaimi, Mohammed - Alghareb, Faris - Abdulfattah, Ahmad. “A Defensive Distillation Approach Towards Adversarial Attacks Based on Gaussian Blurring”. Sakarya University Journal of Computer and Information Sciences 9/2 (June 1, 2026): 336-348. https://doi.org/10.35377/saucis. 1721392.

JAMA

1. Ahmed A, Al-nuaimi M, Alghareb F, Abdulfattah A. A Defensive Distillation Approach Towards Adversarial Attacks Based on Gaussian Blurring. SAUCIS. 2026;9:336–348.

MLA

Ahmed, Ahmed, et al. “A Defensive Distillation Approach Towards Adversarial Attacks Based on Gaussian Blurring”. Sakarya University Journal of Computer and Information Sciences, vol. 9, no. 2, June 2026, pp. 336-48, doi:10.35377/saucis. 1721392.

Vancouver

1. Ahmed Ahmed, Mohammed Al-nuaimi, Faris Alghareb, Ahmad Abdulfattah. A Defensive Distillation Approach Towards Adversarial Attacks Based on Gaussian Blurring. SAUCIS. 2026 Jun. 1;9(2):336-48. doi:10.35377/saucis. 1721392