Planning from Pixels using Inverse Dynamics Models

Sep 28, 2020 (edited Mar 17, 2021) | ICLR 2021 Poster | Readers: Everyone
  • Keywords: model based reinforcement learning, deep reinforcement learning, multi-task learning, deep learning, goal-conditioned reinforcement learning
  • Abstract: Learning dynamics models in high-dimensional observation spaces can be challenging for model-based RL agents. We propose a novel way to learn models in a latent space by learning to predict sequences of future actions conditioned on task completion. These models track task-relevant environment dynamics over a distribution of tasks, while simultaneously serving as an effective heuristic for planning with sparse rewards. We evaluate our method on challenging visual goal completion tasks and show a substantial increase in performance compared to prior model-free approaches.
  • Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
  • One-sentence Summary: GLAMOR learns a latent world model by learning to predict action sequences conditioned on task completion.
