Combating Adversaries with Anti-Adversaries

Motasem Alfarra; Juan Camilo Perez; Ali Thabet; Adel Bibi; Philip Torr; Bernard Ghanem

Published: 21 Jun 2021, Last Modified: 30 Mar 2025 | ICML 2021 Workshop AML Poster | Readers: Everyone

Keywords: Adversarial Attacks, Network Robustness

TL;DR: We propose an anti-adversary layer that increases the network's confidence in its predicted label to enhance model robustness against a variety of attacks.

Abstract: Deep neural networks are vulnerable to small input perturbations known as adversarial attacks. Inspired by the fact that these adversaries are constructed by iteratively minimizing a network's confidence in the true class label, we propose the anti-adversary layer, aimed at countering this effect. In particular, our layer generates an input perturbation in the opposite direction of the adversarial one and feeds the classifier a perturbed version of the input. Our approach is training-free and theoretically supported. We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models, and we conduct large-scale experiments ranging from black-box to adaptive attacks on CIFAR10, CIFAR100 and ImageNet. Our anti-adversary layer significantly enhances model robustness at no cost to clean accuracy.
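
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy linear softmax classifier (so the input gradient has a closed form), uses the model's own prediction as a pseudo-label, and takes a few signed gradient steps that lower the loss on that label. The names `anti_adversary`, `steps`, and `alpha` are hypothetical and chosen only for illustration.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, b, x):
    # Toy linear classifier (an assumption): logits = W x + b
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def sign(v):
    return (v > 0) - (v < 0)

def anti_adversary(W, b, x, steps=2, alpha=0.1):
    """Return (x + gamma, y_hat), where gamma is built by signed gradient
    descent on the cross-entropy loss w.r.t. the model's own prediction."""
    p0 = softmax(logits(W, b, x))
    y_hat = max(range(len(p0)), key=p0.__getitem__)  # pseudo-label
    gamma = [0.0] * len(x)
    for _ in range(steps):
        xp = [xi + gi for xi, gi in zip(x, gamma)]
        p = softmax(logits(W, b, xp))
        # Input gradient of cross-entropy for a linear model: W^T (p - onehot(y_hat))
        grad = [
            sum((p[k] - (1.0 if k == y_hat else 0.0)) * W[k][j] for k in range(len(W)))
            for j in range(len(x))
        ]
        # Step against the gradient: the opposite direction to an adversarial step
        gamma = [gi - alpha * sign(gj) for gi, gj in zip(gamma, grad)]
    return [xi + gi for xi, gi in zip(x, gamma)], y_hat

# Hypothetical two-class example
W = [[1.0, -1.0], [-1.0, 1.0]]
b = [0.0, 0.0]
x = [0.2, 0.1]
x_anti, y_hat = anti_adversary(W, b, x)
conf_before = softmax(logits(W, b, x))[y_hat]
conf_after = softmax(logits(W, b, x_anti))[y_hat]
```

In this toy case the classifier is then evaluated on `x_anti` rather than `x`; the predicted label is unchanged while the confidence in it rises, which is the effect the abstract describes. A real network would need automatic differentiation in place of the closed-form gradient.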

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/combating-adversaries-with-anti-adversaries/code)
