Dream and Search to Control: Latent Space Planning for Continuous Control

Anurag Koul; Varun Kumar Vijay; Alan Fern; Somdeb Majumdar

28 Sept 2020 (modified: 05 May 2023), ICLR 2021 Conference Blind Submission, Readers: Everyone

Keywords: Reinforcement Learning, Model Based RL, Continuous Control, Search, Planning, MCTS

Abstract: Learning and planning with latent-space dynamics has been shown to benefit sample efficiency in model-based reinforcement learning (MBRL) for both discrete and continuous control tasks. In particular, recent work on discrete action spaces demonstrated the effectiveness of latent-space planning via Monte-Carlo Tree Search (MCTS) for bootstrapping MBRL during learning and at test time. However, the potential gains from latent-space tree search have not yet been demonstrated for environments with continuous action spaces. In this work, we propose and explore an MBRL approach for continuous action spaces based on tree-based planning over learned latent dynamics. We show that this approach yields the same types of bootstrapping benefits previously shown for discrete spaces. In particular, it achieves improved sample efficiency and performance on a majority of challenging continuous-control benchmarks compared to the state of the art.
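
The abstract does not spell out the search procedure, so the following is only a minimal sketch of one generic way to run tree search over a latent model when actions are continuous: each node samples a small, fixed set of candidate actions from a policy prior, and a UCB rule chooses among them. It is not the authors' algorithm. The functions `dynamics`, `value`, and `sample_action` are hypothetical toy stand-ins for learned components, and every name and hyperparameter here is an assumption made for illustration.

```python
import math
import random

# Hypothetical stand-ins for learned components (assumptions, not the paper's models).
def dynamics(z, a):
    """Toy latent transition: returns (next latent, reward)."""
    z_next = z + a
    return z_next, -abs(z_next)

def value(z):
    """Toy value estimate used to bootstrap at the search horizon."""
    return -abs(z)

def sample_action(rng):
    """Toy policy prior over a continuous action in [-1, 1]."""
    return rng.uniform(-1.0, 1.0)

class Node:
    def __init__(self, z):
        self.z = z
        self.children = {}    # action -> (reward, child Node)
        self.visits = 0
        self.value_sum = 0.0

    def q(self):
        """Mean return observed through this node (0.0 if unvisited)."""
        return self.value_sum / self.visits if self.visits else 0.0

def expand(node, rng, num_actions):
    """Discretize the continuous action space by sampling candidate actions."""
    for _ in range(num_actions):
        a = sample_action(rng)
        z_next, r = dynamics(node.z, a)
        node.children[a] = (r, Node(z_next))

def select(node, c_ucb, gamma):
    """Pick the child maximizing estimated return plus a UCB exploration bonus."""
    def score(item):
        _, (r, child) = item
        bonus = c_ucb * math.sqrt(math.log(node.visits + 1) / (child.visits + 1))
        return r + gamma * child.q() + bonus
    return max(node.children.items(), key=score)

def simulate(node, depth, rng, num_actions, c_ucb, gamma):
    """Run one simulation from `node` and return its discounted return estimate."""
    if depth == 0:
        ret = value(node.z)   # bootstrap with the value estimate
    else:
        if not node.children:
            expand(node, rng, num_actions)
        _, (r, child) = select(node, c_ucb, gamma)
        ret = r + gamma * simulate(child, depth - 1, rng, num_actions, c_ucb, gamma)
    node.visits += 1
    node.value_sum += ret
    return ret

def plan(z0, num_simulations=200, depth=3, num_actions=8,
         c_ucb=1.0, gamma=0.99, seed=0):
    """Search from latent state `z0` and return the most-visited root action."""
    rng = random.Random(seed)
    root = Node(z0)
    for _ in range(num_simulations):
        simulate(root, depth, rng, num_actions, c_ucb, gamma)
    return max(root.children.items(), key=lambda item: item[1][1].visits)[0]
```

Calling `plan(0.8)` returns one of the sampled root actions in [-1, 1]. In a learned-model setting, the toy functions would be replaced by the trained latent dynamics, reward, value, and policy networks, and the root visit counts could also serve as a training target for the policy.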

One-sentence Summary: We show that tree-based search over learned latent dynamics, used as a planning mechanism for continuous control, outperforms Dreamer.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Supplementary Material: zip

Reviewed Version (pdf): https://openreview.net/references/pdf?id=QIaDcH_FXM
