- Keywords: Representation learning, model-based reinforcement learning
- TL;DR: We combine forward and backward passes for more efficient representation learning in reinforcement learning by imposing a cycle-consistency constraint on the backward pass.
- Abstract: Representation learning is a popular approach for reinforcement learning (RL) tasks modelled as partially observable Markov decision processes. Existing work on representation learning utilises the dynamics models in model-based RL to train through model-predictive reconstruction in a temporally forward fashion. However, temporally backward state predictions also yield useful supervision signals, as they convey information about the future states given the action choices. We argue that combining them with forward passes will facilitate stronger representation learning and improve the sample efficiency of RL. Here we propose a general framework for learning state representations for RL tasks that utilises both forward and backward passes by imposing temporal cycle-consistency constraints. The framework can be integrated with any model-based RL algorithm that leverages a latent dynamics model. We show improved empirical performance in terms of sample efficiency and convergence score over several baselines on continuous control benchmarks.
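
The abstract does not spell out the exact form of the cycle-consistency constraint, so the following is only a minimal illustrative sketch of one way such a constraint could look, not the authors' method. It assumes linear latent models, and every name in it (`forward_model`, `backward_model`, `cycle_consistency_loss`, the matrices `A`, `B`, `C`, `D`) is hypothetical. A forward model predicts the next latent state from the current latent state and action, a backward model predicts the current latent state from the next one and the same action, and the loss penalises the gap between the original latent state and the state recovered after a forward-then-backward round trip.

```python
import numpy as np

def forward_model(z, a, A, B):
    """Hypothetical linear forward dynamics: predicts z_{t+1} from (z_t, a_t)."""
    return z @ A.T + a @ B.T

def backward_model(z_next, a, C, D):
    """Hypothetical linear backward dynamics: predicts z_t from (z_{t+1}, a_t)."""
    return z_next @ C.T + a @ D.T

def cycle_consistency_loss(z, a, A, B, C, D):
    """Mean squared error between z_t and its forward-then-backward reconstruction."""
    z_next_pred = forward_model(z, a, A, B)       # forward pass in latent space
    z_cycle = backward_model(z_next_pred, a, C, D)  # backward pass to time t
    return float(np.mean(np.sum((z_cycle - z) ** 2, axis=-1)))

rng = np.random.default_rng(0)
latent_dim, action_dim, batch = 4, 2, 8

z = rng.normal(size=(batch, latent_dim))   # assumed latent states z_t
a = rng.normal(size=(batch, action_dim))   # assumed actions a_t

# Forward parameters; A is kept close to the identity so it is well conditioned.
A = np.eye(latent_dim) + 0.1 * rng.normal(size=(latent_dim, latent_dim))
B = rng.normal(size=(latent_dim, action_dim))

# A backward model that exactly inverts the forward model:
# z_t = A^{-1} (z_{t+1} - B a_t), so C = A^{-1} and D = -A^{-1} B.
C_exact = np.linalg.inv(A)
D_exact = -C_exact @ B
loss_exact = cycle_consistency_loss(z, a, A, B, C_exact, D_exact)

# An unrelated random backward model violates the cycle constraint.
C_rand = rng.normal(size=(latent_dim, latent_dim))
D_rand = rng.normal(size=(latent_dim, action_dim))
loss_rand = cycle_consistency_loss(z, a, A, B, C_rand, D_rand)

print(f"consistent backward model: {loss_exact:.3e}")
print(f"random backward model:     {loss_rand:.3e}")
```

In this toy setting the loss is numerically zero when the backward model inverts the forward model and clearly positive otherwise. In a learned setting, a term of this kind would presumably be added to the usual forward prediction objective and minimised jointly over the encoder and both dynamics models; that combination is an assumption here, not something the abstract states.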