Spectral Robustness Analysis of Deep Imitation Learning

Published: 05 Dec 2022, Last Modified: 05 May 2023 | MLSW2022 | Readers: Everyone
Abstract: Deep reinforcement learning algorithms have enabled learning functioning policies in MDPs with complex state representations. Following these advancements, deep reinforcement learning policies have been deployed in many diverse settings. However, a line of research has argued that in certain settings building a reward function can be more complicated than learning it. Hence, several studies have proposed methods to learn a reward function by observing trajectories of a functioning policy (i.e., inverse reinforcement learning). Following this line of research, several studies have proposed to directly learn a functioning policy solely by observing trajectories of an expert (i.e., imitation learning). In this paper, we propose a novel method to analyze the spectral robustness of deep neural policies. We conduct several experiments in the Arcade Learning Environment and demonstrate that simple vanilla-trained deep reinforcement learning policies are more robust than deep imitation learning policies. We believe that our method provides a comprehensive analysis of policy robustness and can help in understanding the fundamental properties of different training techniques.