Adversarially Trained Neural Policies in the Fourier DomainDownload PDF

Published: 21 Jun 2021, Last Modified: 05 May 2023ICML 2021 Workshop AML PosterReaders: Everyone
Keywords: reinforcement learning, deep learning, adversarial, deep reinforcement learning, robustness, adversarial training, deep neural networks, safety
Abstract: Reinforcement learning policies based on deep neural networks are vulnerable to imperceptible adversarial perturbations to their inputs, in much the same way as neural network image classifiers. Recent work has proposed several methods for adversarial training for deep reinforcement learning agents to improve robustness to adversarial perturbations. In this paper, we study the effects of adversarial training on the neural policy learned by the agent. In particular, we compare the Fourier spectrum of minimal perturbations computed for both adversarially trained and vanilla trained neural policies. Via experiments in the OpenAI Atari environments we show that minimal perturbations computed for adversarially trained policies are more focused on lower frequencies in the Fourier domain, indicating a higher sensitivity of these policies to low frequency perturbations. We believe our results can be an initial step towards understanding the relationship between adversarial training and different notions of robustness for neural policies.
2 Replies

Loading