Planning from Pixels using Inverse Dynamics Models

Keiran Paster; Sheila A. McIlraith; Jimmy Ba

Published: 12 Jan 2021, Last Modified: 05 May 2023, ICLR 2021 Poster, Readers: Everyone

Keywords: model based reinforcement learning, deep reinforcement learning, multi-task learning, deep learning, goal-conditioned reinforcement learning

Abstract: Learning dynamics models in high-dimensional observation spaces can be challenging for model-based RL agents. We propose a novel way to learn models in a latent space by learning to predict sequences of future actions conditioned on task completion. These models track task-relevant environment dynamics over a distribution of tasks, while simultaneously serving as an effective heuristic for planning with sparse rewards. We evaluate our method on challenging visual goal completion tasks and show a substantial increase in performance compared to prior model-free approaches.
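
As a rough illustration of the idea in the abstract, predicting action sequences conditioned on task completion and reusing that predictor for planning, here is a minimal toy sketch. It is not the authors' implementation: the five-state chain environment, the tabular count model standing in for a learned latent neural model, and every name in it (`step`, `rollout`, `fit_inverse_model`, `plan`) are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from itertools import product

# Hypothetical toy environment: a chain of states 0..4 (not from the paper).
N_STATES = 5
ACTIONS = (-1, +1)   # step left or step right
HORIZON = 2          # length of the action sequences being modelled


def step(state, action):
    """Move along the chain, clipping at both ends."""
    return min(max(state + action, 0), N_STATES - 1)


def rollout(state, actions):
    """Apply a sequence of actions and return the final state."""
    for action in actions:
        state = step(state, action)
    return state


def fit_inverse_model():
    """Count which action sequences led from each start state to each outcome.

    The counts are an empirical, tabular stand-in for a learned model of
    p(action sequence | start state, achieved goal). For simplicity the data
    is every possible rollout rather than sampled experience.
    """
    counts = defaultdict(Counter)
    for start in range(N_STATES):
        for seq in product(ACTIONS, repeat=HORIZON):
            goal = rollout(start, seq)
            counts[(start, goal)][seq] += 1
    return counts


def plan(counts, start, goal):
    """Return the action sequence most often seen reaching `goal` from `start`.

    Returns None if the goal was never reached from that start state.
    """
    seqs = counts.get((start, goal))
    if not seqs:
        return None
    return seqs.most_common(1)[0][0]


inverse_model = fit_inverse_model()
```

Calling `plan(inverse_model, 0, 2)` looks up the action sequence that the count table associates with moving from state 0 to state 2. The model is only ever queried for actions given a start and a goal, and never predicts observations; this is how the sketch mirrors, in a very simplified way, the abstract's point about tracking only task-relevant dynamics.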

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

One-sentence Summary: GLAMOR learns a latent world model by learning to predict action sequences conditioned on task completion.

Data: [DeepMind Control Suite](https://paperswithcode.com/dataset/deepmind-control-suite)
