- Keywords: Reinforcement Learning, Data Efficiency, Representation Learning
- Abstract: Learning informative representations from image-based observations is a fundamental problem in deep Reinforcement Learning (RL), yet data inefficiency remains a significant barrier. To address this, we investigate Predictive Consistent Representations (PCR), a method that enforces predictive consistency on a learned dynamics model. Unlike previous algorithms that simply exploit a forward dynamics model, the PCR agent is trained to predict the future state and to keep that prediction consistent across multiple views of the observation; careful ablation experiments demonstrate the effect of this design. We empirically show that PCR outperforms current state-of-the-art baselines in data efficiency on a series of pixel-based control tasks in the DeepMind Control Suite. Notably, on challenging tasks such as Cheetah-run, PCR achieves a 47.4% improvement when the agent is limited to 100k environment steps.
- One-sentence Summary: We present Predictive Consistency Representation (PCR), a self-supervised representation learning algorithm that significantly improves data efficiency for RL agents with visual inputs.
- Supplementary Material: zip