Model-Based Reinforcement Learning via Latent-Space Collocation

Sep 28, 2020 (edited Mar 05, 2021) | ICLR 2021 Conference Blind Submission | Readers: Everyone
  • Keywords: visual model-based reinforcement learning, visual planning, long-horizon planning, collocation
  • Abstract: The ability to construct and execute long-term plans enables intelligent agents to solve complex multi-step tasks and prevents myopic behavior that seeks only short-term reward. Recent work has achieved significant progress on building agents that can predict and plan from raw visual observations. However, existing visual planning methods still require a densely shaped reward that provides the algorithm with a short-term signal that is always easy to optimize. These algorithms fail when the shaped reward is not available, because they use simplistic planning methods such as sampling-based random shooting and are unable to plan for a distant goal. Instead, to achieve long-horizon visual control, we propose to use collocation-based planning, a powerful optimal control technique that plans forward a sequence of states while constraining the transitions to be physical. We propose a planning algorithm that adapts collocation to visual planning by leveraging probabilistic latent variable models. A model-based reinforcement learning agent equipped with our planning algorithm significantly outperforms prior model-based agents on challenging visual control tasks with sparse rewards and long-term goals.
  • One-sentence Summary: We propose a visual model-based reinforcement learning agent that plans via collocation in the latent space and outperforms prior shooting-based planning methods.
  • Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
  • Supplementary Material: zip
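
The abstract contrasts shooting, which searches over action sequences only, with collocation, which optimizes states and actions together while penalizing transitions that violate the dynamics. The idea can be illustrated with a minimal sketch. This is not the paper's latent-space algorithm: it assumes a hypothetical 1-D integrator s_{t+1} = s_t + a_t in place of a learned latent model, a sparse terminal goal cost, and a quadratic penalty instead of a hard dynamics constraint. The function name and all weights are illustrative choices.

```python
import numpy as np


def plan_by_collocation(horizon=10, start=0.0, goal=1.0,
                        dyn_weight=1e3, act_weight=1e-3):
    """Toy penalty-based collocation for the assumed dynamics s_{t+1} = s_t + a_t.

    States s_1..s_H and actions a_0..a_{H-1} are optimized jointly by
    minimizing
        (s_H - goal)^2
        + dyn_weight * sum_t (s_{t+1} - s_t - a_t)^2
        + act_weight * sum_t a_t^2.
    Because the toy dynamics are linear, this is a linear least-squares
    problem and is solved in closed form here; a learned nonlinear model
    would need an iterative optimizer instead.
    """
    H = horizon
    n_rows = H + 1 + H              # dynamics rows, goal row, action rows
    A = np.zeros((n_rows, 2 * H))   # unknowns: [s_1..s_H, a_0..a_{H-1}]
    b = np.zeros(n_rows)

    w_dyn = np.sqrt(dyn_weight)
    for t in range(H):
        A[t, t] = w_dyn             # coefficient of s_{t+1}
        A[t, H + t] = -w_dyn        # coefficient of a_t
        if t > 0:
            A[t, t - 1] = -w_dyn    # coefficient of s_t
        else:
            b[t] = w_dyn * start    # s_0 is fixed, so it moves to the right-hand side

    # Sparse goal cost: only the final state is tied to the goal.
    A[H, H - 1] = 1.0
    b[H] = goal

    # Small action penalty keeps the solution unique and the actions modest.
    w_act = np.sqrt(act_weight)
    for t in range(H):
        A[H + 1 + t, H + t] = w_act

    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    states = np.concatenate(([start], x[:H]))   # s_0..s_H
    actions = x[H:]                             # a_0..a_{H-1}
    return states, actions


if __name__ == "__main__":
    states, actions = plan_by_collocation()
    print("planned states: ", np.round(states, 3))
    print("planned actions:", np.round(actions, 3))
```

Because the intermediate states are free variables, the goal cost on the final state shapes the whole trajectory at once rather than being propagated through a rollout of actions. The dynamics weight controls how strictly the planned transitions must agree with the model.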