Planning from Pixels using Inverse Dynamics Models

Published: 12 Jan 2021, Last Modified: 05 May 2023, ICLR 2021 Poster, Readers: Everyone
Keywords: model based reinforcement learning, deep reinforcement learning, multi-task learning, deep learning, goal-conditioned reinforcement learning
Abstract: Learning dynamics models in high-dimensional observation spaces can be challenging for model-based RL agents. We propose a novel way to learn models in a latent space by learning to predict sequences of future actions conditioned on task completion. These models track task-relevant environment dynamics over a distribution of tasks, while simultaneously serving as an effective heuristic for planning with sparse rewards. We evaluate our method on challenging visual goal completion tasks and show a substantial increase in performance compared to prior model-free approaches.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: GLAMOR learns a latent world model by learning to predict action sequences conditioned on task completion.
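To make the idea of "predicting action sequences conditioned on task completion" concrete, here is a minimal tabular sketch. It is not the paper's method: GLAMOR learns a latent model from pixels, whereas this toy replaces the learned model with simple counts on a hypothetical five-state chain. The environment, the horizon, and every function name below are assumptions introduced purely for illustration.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy setup (not from the paper): a chain of positions 0..4.
N_STATES = 5
ACTIONS = (-1, +1)   # step left / step right
HORIZON = 3          # length of the action sequences being predicted


def step(state, action):
    """Deterministic toy dynamics: move one cell and clip to the chain."""
    return min(max(state + action, 0), N_STATES - 1)


def rollout(start, actions):
    """Apply a sequence of actions and return the final state."""
    state = start
    for action in actions:
        state = step(state, action)
    return state


def collect_counts(num_episodes, rng):
    """Run random action sequences and relabel each final state as the goal.

    The returned table maps (start, goal) to a Counter over action
    sequences, i.e. an empirical estimate of p(actions | start, goal).
    """
    counts = defaultdict(Counter)
    for _ in range(num_episodes):
        start = rng.randrange(N_STATES)
        actions = tuple(rng.choice(ACTIONS) for _ in range(HORIZON))
        goal = rollout(start, actions)
        counts[(start, goal)][actions] += 1
    return counts


def plan(counts, start, goal):
    """Return the most frequently observed action sequence, or None."""
    seqs = counts.get((start, goal))
    if not seqs:
        return None
    return max(seqs, key=seqs.get)


counts = collect_counts(5000, random.Random(0))
best = plan(counts, start=0, goal=3)
print(best, rollout(0, best))
```

In this sketch the goal-conditioned table plays the role of the planning heuristic: sequences that were never seen to complete the task are never proposed, so a goal that is unreachable within the horizon yields `None` rather than a plan.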
Data: DeepMind Control Suite