Research Article

Perturbation Augmentation for Adversarial Training with Diverse Attacks

Volume: 11 Number: 2 June 29, 2024
Abstract

Adversarial Training (AT) aims to alleviate the vulnerability of deep neural networks to adversarial perturbations. However, AT techniques struggle to maintain performance on natural samples while improving the deep model’s robustness. The lack of diversity in the perturbations generated during adversarial training degrades the generalizability of robust models, causing overfitting to particular perturbations and a decrease in natural performance. This study proposes an adversarial training framework that augments adversarial directions obtained from a single-step attack to address the trade-off between robustness and generalization. Inspired by feature scattering adversarial training, the proposed framework computes a principal adversarial direction with a single-step attack that finds a perturbation disrupting the inter-sample relationships in the mini-batch. The principal direction obtained at each iteration is augmented by sampling new adversarial directions within a region spanning 45 degrees from the principal adversarial direction. This adversarial direction augmentation requires no extra backpropagation steps. Therefore, the generalization of the robust model is improved without placing an additional burden on feature scattering adversarial training. Experiments on CIFAR-10, CIFAR-100, SVHN, Tiny-ImageNet, and the German Traffic Sign Recognition Benchmark show that the proposed approach consistently improves accuracy on adversarial examples while almost fully preserving natural performance.
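
The direction augmentation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `sample_direction_in_cone` and `angle_deg`, the uniform sampling of the angle, and the choice to keep the perturbation norm unchanged are all assumptions made for illustration. The sketch only shows that a new direction within 45 degrees of a given principal direction can be drawn with random sampling alone, with no gradient computation.

```python
import numpy as np


def sample_direction_in_cone(principal, max_angle_deg=45.0, rng=None):
    """Sample a direction within `max_angle_deg` of `principal`.

    Hypothetical helper (not from the paper). Assumes the flattened
    perturbation has at least two dimensions and is non-zero.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = np.asarray(principal, dtype=np.float64).ravel()
    norm = np.linalg.norm(d)
    if norm == 0.0:
        raise ValueError("principal direction must be non-zero")
    d_hat = d / norm

    # Draw a random vector and remove its component along d_hat,
    # leaving a unit vector orthogonal to the principal direction.
    g = rng.standard_normal(d.shape)
    g -= np.dot(g, d_hat) * d_hat
    u = g / np.linalg.norm(g)

    # Assumption: the angle is drawn uniformly from [0, max_angle_deg].
    theta = np.deg2rad(rng.uniform(0.0, max_angle_deg))

    # Rotate d_hat towards u by theta; the result is a unit vector.
    new_dir = np.cos(theta) * d_hat + np.sin(theta) * u

    # Assumption: keep the original perturbation norm and shape.
    return (norm * new_dir).reshape(np.shape(principal))


def angle_deg(a, b):
    """Angle in degrees between two arrays, treated as flat vectors."""
    a = np.ravel(a)
    b = np.ravel(b)
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for a single-step adversarial perturbation of one image.
    principal = rng.standard_normal((3, 32, 32))
    augmented = sample_direction_in_cone(principal, rng=rng)
    print("angle to principal direction (deg):", angle_deg(principal, augmented))
```

In a training loop, a sampled direction of this kind would stand in for (or be used alongside) the principal perturbation before the usual projection onto the allowed perturbation set; how the paper combines and projects the sampled directions is not stated in the abstract.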

Details

Primary Language

English

Subjects

Deep Learning

Journal Section

Research Article

Early Pub Date

June 4, 2024

Publication Date

June 29, 2024

Submission Date

March 26, 2024

Acceptance Date

May 21, 2024

Published in Issue

Year 2024 Volume: 11 Number: 2

APA
Serbes, D., & Baytaş, İ. M. (2024). Perturbation Augmentation for Adversarial Training with Diverse Attacks. Gazi University Journal of Science Part A: Engineering and Innovation, 11(2), 274-288. https://doi.org/10.54287/gujsa.1458880
AMA
1. Serbes D, Baytaş İM. Perturbation Augmentation for Adversarial Training with Diverse Attacks. GU J Sci, Part A. 2024;11(2):274-288. doi:10.54287/gujsa.1458880
Chicago
Serbes, Duygu, and İnci M. Baytaş. 2024. “Perturbation Augmentation for Adversarial Training With Diverse Attacks”. Gazi University Journal of Science Part A: Engineering and Innovation 11 (2): 274-88. https://doi.org/10.54287/gujsa.1458880.
EndNote
Serbes D, Baytaş İM (June 1, 2024) Perturbation Augmentation for Adversarial Training with Diverse Attacks. Gazi University Journal of Science Part A: Engineering and Innovation 11 2 274–288.
IEEE
[1] D. Serbes and İ. M. Baytaş, “Perturbation Augmentation for Adversarial Training with Diverse Attacks”, GU J Sci, Part A, vol. 11, no. 2, pp. 274–288, June 2024, doi: 10.54287/gujsa.1458880.
ISNAD
Serbes, Duygu - Baytaş, İnci M. “Perturbation Augmentation for Adversarial Training With Diverse Attacks”. Gazi University Journal of Science Part A: Engineering and Innovation 11/2 (June 1, 2024): 274-288. https://doi.org/10.54287/gujsa.1458880.
JAMA
1. Serbes D, Baytaş İM. Perturbation Augmentation for Adversarial Training with Diverse Attacks. GU J Sci, Part A. 2024;11:274–288.
MLA
Serbes, Duygu, and İnci M. Baytaş. “Perturbation Augmentation for Adversarial Training With Diverse Attacks”. Gazi University Journal of Science Part A: Engineering and Innovation, vol. 11, no. 2, June 2024, pp. 274-88, doi:10.54287/gujsa.1458880.
Vancouver
1. Duygu Serbes, İnci M. Baytaş. Perturbation Augmentation for Adversarial Training with Diverse Attacks. GU J Sci, Part A. 2024 Jun. 1;11(2):274-88. doi:10.54287/gujsa.1458880