Dream to Control: Learning Behaviors by Latent Imagination

Danijar Hafner; Timothy Lillicrap; Jimmy Ba; Mohammad Norouzi

Dream to Control: Learning Behaviors by Latent Imagination

Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi

Published: 20 Dec 2019, Last Modified: 22 Jun 2025ICLR 2020 Conference Blind SubmissionReaders: Everyone

TL;DR: We present Dreamer, an agent that learns long-horizon behaviors purely by latent imagination using analytic value gradients.

Abstract: Learned world models summarize an agent's experience to facilitate learning complex behaviors. While learning world models from high-dimensional sensory inputs is becoming feasible through deep learning, there are many potential ways for deriving behaviors from them. We present Dreamer, a reinforcement learning agent that solves long-horizon tasks from images purely by latent imagination. We efficiently learn behaviors by propagating analytic gradients of learned state values back through trajectories imagined in the compact state space of a learned world model. On 20 challenging visual control tasks, Dreamer exceeds existing approaches in data-efficiency, computation time, and final performance.

Keywords: world model, latent dynamics, imagination, planning by backprop, policy optimization, planning, reinforcement learning, control, representations, latent variable model, visual control, value function

Code: https://danijar.com/dreamer

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 8 code implementations](https://www.catalyzex.com/paper/dream-to-control-learning-behaviors-by-latent/code)

Original Pdf: pdf

15 Replies

Loading