Model-Based Reinforcement Learning via Latent-Space Collocation

Oleh Rybkin; Chuning Zhu; Anusha Nagabandi; Kostas Daniilidis; Igor Mordatch; Sergey Levine

28 Sept 2020 (modified: 22 Jun 2025)

ICLR 2021 Conference Blind Submission

Readers: Everyone

Keywords: visual model-based reinforcement learning, visual planning, long-horizon planning, collocation

Abstract: The ability to construct and execute long-term plans enables intelligent agents to solve complex multi-step tasks and prevents myopic behavior that seeks only short-term reward. Recent work has achieved significant progress on building agents that can predict and plan from raw visual observations. However, existing visual planning methods still require a densely shaped reward that provides the algorithm with a short-term signal that is always easy to optimize. These algorithms fail when the shaped reward is not available, because they use simplistic planning methods such as sampling-based random shooting and are unable to plan for a distant goal. Instead, to achieve long-horizon visual control, we propose to use collocation-based planning, a powerful optimal control technique that plans forward a sequence of states while constraining the transitions to be physically feasible. We propose a planning algorithm that adapts collocation to visual planning by leveraging probabilistic latent variable models. A model-based reinforcement learning agent equipped with our planning algorithm significantly outperforms prior model-based agents on challenging visual control tasks with sparse rewards and long-term goals.
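To make the collocation idea in the abstract concrete, here is a minimal sketch that is not the authors' implementation. It assumes a known 1-D toy system instead of a learned latent model, and it swaps the hard dynamics constraint for a quadratic penalty minimized by plain gradient descent. The names `dynamics` and `collocation_plan`, the horizon, and the step size are all hypothetical choices for illustration.

```python
import numpy as np

def dynamics(s, a):
    # Hypothetical toy dynamics (assumption): a 1-D point shifted by the action.
    return s + a

def collocation_plan(s0, goal, horizon=10, iters=2000, lr=0.05):
    # Collocation optimizes states s_1..s_T AND actions a_0..a_{T-1} jointly,
    # unlike shooting, which optimizes actions only and rolls the model forward.
    states = np.zeros(horizon)
    actions = np.zeros(horizon)
    for _ in range(iters):
        prev = np.concatenate(([s0], states[:-1]))  # s_0..s_{T-1}
        # Dynamics residuals: zero exactly when every transition is feasible.
        resid = states - dynamics(prev, actions)
        # Objective: (s_T - goal)^2 + sum(resid^2).
        # The gradients below are hand-derived for the toy dynamics above.
        grad_s = 2.0 * resid
        grad_s[:-1] -= 2.0 * resid[1:]  # s_t is also the input of step t+1
        grad_s[-1] += 2.0 * (states[-1] - goal)
        grad_a = -2.0 * resid
        states -= lr * grad_s
        actions -= lr * grad_a
    return states, actions

if __name__ == "__main__":
    states, actions = collocation_plan(s0=1.0, goal=4.0)
    # Execute the planned actions with the true dynamics to check feasibility.
    s = 1.0
    for a in actions:
        s = dynamics(s, a)
    print("final state after rollout:", s)
```

In this sketch only the final state is scored, which loosely mirrors the sparse, distant-goal setting the abstract describes: the state variables can move toward the goal directly, and the penalty term then pulls the trajectory back toward feasible transitions.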

One-sentence Summary: We propose a visual model-based reinforcement learning agent that plans with collocation in the latent space and outperforms prior shooting-based planning methods.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Supplementary Material: zip

Community Implementations: [2 code implementations](https://www.catalyzex.com/paper/model-based-reinforcement-learning-via-latent/code)

Reviewed Version (pdf): https://openreview.net/references/pdf?id=wCtJtcLkgx
