Reinforcement Learning with Augmentation Invariant Representation: A Non-contrastive Approach

Published: 28 Oct 2023, Last Modified: 25 Dec 2023, GenPlan'23
Abstract: Data augmentation has proven effective at improving generalization in reinforcement learning (RL). However, recent approaches use the augmented data directly to learn the value estimate or to regularize it, and often overlook the central requirement that the model learn that an augmented observation represents the same state as the original. In this work, we present RAIR (Reinforcement learning with Augmentation Invariant Representation), which disentangles the representation learning task from the RL task and aims to learn similar latent representations for an original observation and its augmented counterpart. Our approach learns representations of high-dimensional visual observations in a non-contrastive, self-supervised way, combined with the standard RL objective. In particular, RAIR gradually pushes the latent representation of an observation closer to the representations produced for the corresponding augmented observations. As a result, the agent is more resilient to changes in the environment. We evaluate RAIR on all sixteen environments from the RL generalization benchmark Procgen. The experimental results indicate that RAIR outperforms PPO and other data augmentation-based approaches under the standard evaluation protocol.
Submission Number: 88
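
The idea stated in the abstract, pulling the latent of an observation toward the latent of its augmented view without negative pairs and adding that term to the RL objective, can be illustrated with a minimal sketch. The abstract does not specify the loss, so everything here is an assumption for illustration: the squared distance between unit-normalized latents, the hypothetical function names `invariance_loss` and `total_loss`, and the `weight` coefficient are not taken from the paper. Latents are plain lists of floats standing in for encoder outputs.

```python
import math


def l2_normalize(v, eps=1e-8):
    """Scale a latent vector to (approximately) unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def invariance_loss(z_orig, z_aug):
    """Mean squared distance between normalized latents of paired views.

    Assumed form, not the paper's: only positive pairs (an observation and
    its own augmentation) are compared, so no negative samples are needed.
    Both arguments are non-empty batches of equal length.
    """
    total = 0.0
    for zo, za in zip(z_orig, z_aug):
        a, b = l2_normalize(zo), l2_normalize(za)
        total += sum((x - y) ** 2 for x, y in zip(a, b))
    return total / len(z_orig)


def total_loss(rl_loss, z_orig, z_aug, weight=1.0):
    """Combine a scalar RL loss (e.g. from PPO) with the invariance term.

    `weight` is a hypothetical trade-off coefficient.
    """
    return rl_loss + weight * invariance_loss(z_orig, z_aug)
```

With this assumed form, latents that point in the same direction incur a loss near zero, while orthogonal latents incur a loss near 2, so minimizing the term moves the two representations together. A non-contrastive method typically also needs some mechanism to avoid collapsed representations (for example a stop-gradient on one branch); the abstract does not say which mechanism RAIR uses, and this sketch omits it.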