Behavior Predictive Representations for Generalization in Reinforcement Learning

Siddhant Agarwal; Aaron Courville; Rishabh Agarwal

12 Oct 2021 (modified: 05 May 2023) | Deep RL Workshop NeurIPS 2021 | Readers: Everyone

Keywords: Representation Learning, Generalization, Model-Based Reinforcement Learning

Abstract: Deep reinforcement learning (RL) agents trained on a few environments often struggle to generalize to unseen environments, even when those environments are semantically equivalent to the training environments. Such agents learn representations that overfit the characteristics of the training environments. We posit that generalization can be improved by assigning similar representations to scenarios with similar sequences of long-term optimal behavior. To do so, we propose behavior predictive representations (BPR), which capture long-term optimal behavior. BPR trains an agent to predict latent state representations multiple steps into the future such that these representations can predict the optimal behavior at those future steps. We demonstrate that BPR provides large gains on a jumping task from pixels, a problem designed to test generalization.
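
The objective described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the linear encoder `W_enc`, the action-conditioned latent transition `W_dyn`, the linear policy head `W_pi`, the `tanh` nonlinearity, and the function name `bpr_style_loss` are all assumptions made for illustration. The sketch encodes one observation, rolls the latent forward for several steps without seeing future observations, and penalizes the policy head whenever a predicted latent fails to recover the optimal action at that step.

```python
import numpy as np


def softmax_cross_entropy(logits, action):
    """Cross-entropy of one integer action under unnormalized logits."""
    shifted = logits - logits.max()  # subtract the max for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return -log_probs[action]


def bpr_style_loss(obs, actions, W_enc, W_dyn, W_pi, horizon):
    """Hypothetical multi-step behavior-prediction loss (illustrative only).

    obs:     (obs_dim,) observation at time t
    actions: (horizon + 1,) integer optimal actions a_t, ..., a_{t+horizon}
    W_enc:   (latent_dim, obs_dim) linear encoder (assumption)
    W_dyn:   (num_actions, latent_dim, latent_dim) latent transition,
             one matrix per action (assumption)
    W_pi:    (num_actions, latent_dim) linear policy head (assumption)
    """
    z = np.tanh(W_enc @ obs)  # latent for the current observation
    total = softmax_cross_entropy(W_pi @ z, actions[0])
    for k in range(horizon):
        # Predict the next latent from the current latent and the action taken.
        z = np.tanh(W_dyn[actions[k]] @ z)
        # The predicted latent should suffice to recover the optimal action.
        total += softmax_cross_entropy(W_pi @ z, actions[k + 1])
    return total / (horizon + 1)


# Toy usage with random weights and an arbitrary action sequence.
rng = np.random.default_rng(0)
obs_dim, latent_dim, num_actions, horizon = 8, 4, 2, 3
loss = bpr_style_loss(
    obs=rng.normal(size=obs_dim),
    actions=np.array([0, 1, 0, 0]),
    W_enc=rng.normal(size=(latent_dim, obs_dim)),
    W_dyn=rng.normal(size=(num_actions, latent_dim, latent_dim)),
    W_pi=rng.normal(size=(num_actions, latent_dim)),
    horizon=horizon,
)
```

In this sketch, two observations that lead to the same sequence of optimal actions are pushed toward latents from which that sequence is predictable, which is one way to read "similar representations for similar long-term optimal behavior". In practice the weights would be trained by gradient descent in an automatic differentiation framework; only the forward computation of the loss is shown here.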
