Training with Worst-Case Distributional Shift causes Overestimation and Inaccuracies in State-Action Value Functions

29 Sept 2021 (modified: 13 Feb 2023) · ICLR 2022 Conference Withdrawn Submission · Readers: Everyone
Abstract: The use of deep neural networks as function approximators for the state-action value function created a new research area for self-learning systems and made it possible to learn optimal policies from high-dimensional state representations. While this initial success led to deep neural policies being employed in many diverse disciplines with manifold applications, concerns remain about their resilience to specifically crafted imperceptible adversarial perturbations. To address these concerns, several studies have focused on building deep neural policies that are resilient to such perturbations by training in their presence (i.e. adversarial training). In this paper we investigate the state-action value functions learned by state-of-the-art adversarially trained deep neural policies and by vanilla trained deep neural policies. We theoretically motivate that the idea behind the state-of-the-art adversarial training method causes overestimation bias and inaccuracies in the state-action value function. We perform several experiments in the Arcade Learning Environment (ALE) and show that adversarially trained deep neural policies indeed suffer from overestimation bias. Furthermore, the state-action value functions learned by vanilla trained deep neural policies provide more accurate estimates for non-optimal actions than those of state-of-the-art adversarially trained deep neural policies. We believe our study reveals intriguing properties of adversarial training and could be a critical step towards obtaining robust and reliable policies.
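
To make the central quantity concrete: overestimation bias in a state-action value function is commonly measured by comparing the learned Q-value of the greedy action at a visited state with the discounted Monte-Carlo return actually obtained from that state onward. The snippet below is a minimal illustrative sketch of that comparison, not the paper's evaluation protocol; the function names (discounted_returns, overestimation_bias) and the toy arrays q_hat and r are hypothetical, and the discount factor gamma = 0.99 is an assumed value typical for ALE agents.

    # Illustrative sketch: estimate overestimation bias as the gap between
    # predicted greedy Q-values and realized discounted returns in one episode.
    import numpy as np

    def discounted_returns(rewards, gamma=0.99):
        """Compute G_t = sum_k gamma^k * r_{t+k} for every timestep of an episode."""
        returns = np.zeros(len(rewards))
        running = 0.0
        for t in reversed(range(len(rewards))):
            running = rewards[t] + gamma * running
            returns[t] = running
        return returns

    def overestimation_bias(greedy_q_values, rewards, gamma=0.99):
        """Mean difference between predicted Q(s_t, argmax_a Q) and the realized return G_t."""
        returns = discounted_returns(rewards, gamma)
        return float(np.mean(np.asarray(greedy_q_values) - returns))

    # Toy example with made-up numbers along a single episode.
    q_hat = [2.4, 2.1, 1.8, 1.2]   # hypothetical Q(s_t, a_t) predictions for the greedy actions
    r = [0.0, 1.0, 0.0, 1.0]       # rewards observed in the same episode
    print(overestimation_bias(q_hat, r))  # positive value => Q-values overestimate the returns

Under this assumed measure, a consistently positive gap for the adversarially trained policy and a smaller (or near-zero) gap for the vanilla policy would correspond to the overestimation effect the abstract describes.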
Supplementary Material: zip