- Keywords: model-based reinforcement learning, deep reinforcement learning, multi-task learning, deep learning, goal-conditioned reinforcement learning
- Abstract: Learning dynamics models in high-dimensional observation spaces can be challenging for model-based RL agents. We propose a novel approach that learns models in a latent space by predicting sequences of future actions conditioned on task completion. These models track task-relevant environment dynamics over a distribution of tasks, while simultaneously serving as an effective heuristic for planning with sparse rewards. We evaluate our method on challenging visual goal-completion tasks and show a substantial increase in performance compared to prior model-free approaches.
- Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
- One-sentence Summary: GLAMOR learns a latent world model by learning to predict action sequences conditioned on task completion.