Model-based Reinforcement Learning with Ensembled Model-value Expansion

29 Sept 2021 (modified: 13 Feb 2023). ICLR 2022 Conference Withdrawn Submission.
Keywords: MBRL, Model-based Reinforcement Learning, Ensemble, Dynamics, Neural Network, Reinforcement Learning
Abstract: Model-based reinforcement learning (MBRL) methods are often more data-efficient and quicker to converge than their model-free counterparts, but typically rely crucially on accurate modeling of the environment dynamics and associated uncertainty in order to perform well. Recent approaches have used ensembles of dynamics models within MBRL to separately capture aleatoric and epistemic uncertainty of the learned dynamics, but many MBRL algorithms are still limited because they treat these dynamics models as a "black box" without fully exploiting the uncertainty modeling. In this paper, we propose a simple but effective approach to improving the performance of MBRL by directly incorporating the ensemble prediction \emph{into} the RL method itself: we propose constructing multiple value roll-outs using different members of the dynamics ensemble, and aggregating the separate estimates to form a joint estimate of the state value. Despite its simplicity, we show that this method substantially improves the performance of MBRL methods: we comprehensively evaluate this technique on common locomotion benchmarks, with ablative experiments to show the added value of our proposed components.
One-sentence Summary: Incorporating ensemble dynamics models into "Dyna-like" MBRL methods.
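The core idea in the abstract, rolling out a value estimate separately under each member of the dynamics ensemble and aggregating, can be sketched as follows. This is an illustrative reconstruction, not the paper's implementation: the function names, the deterministic model interface, and the choice of a simple mean aggregator are all assumptions.

```python
import numpy as np

def ensemble_value_estimate(state, policy, models, value_fn, horizon=3, gamma=0.99):
    """Roll out `horizon` steps under each dynamics model in the ensemble,
    then aggregate the per-member H-step value estimates (here: a mean)."""
    estimates = []
    for model in models:  # one separate roll-out per ensemble member
        s, ret, discount = state, 0.0, 1.0
        for _ in range(horizon):
            a = policy(s)
            s, r = model(s, a)          # predicted next state and reward
            ret += discount * r
            discount *= gamma
        ret += discount * value_fn(s)   # bootstrap with the learned value function
        estimates.append(ret)
    return float(np.mean(estimates))    # joint estimate across the ensemble
```

Other aggregators (e.g. a pessimistic minimum, or weighting by model uncertainty) drop in by replacing the final `np.mean`.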
