Keywords: Model-Based Reinforcement Learning, Model Rollouts, Uncertainty Quantification
TL;DR: We propose Infoprop, a novel rollout mechanism for model-based reinforcement learning that yields substantially improved data consistency and long-term planning capabilities.
Abstract: Model-based reinforcement learning (MBRL) seeks to enhance data efficiency by learning a model of the environment and generating synthetic rollouts from it. However, model errors accumulate over these rollouts and can distort the data distribution, negatively impacting policy learning and hindering long-term planning; this error accumulation is a key bottleneck in current MBRL methods. We propose Infoprop, a model-based rollout mechanism that separates aleatoric from epistemic model uncertainty and reduces the influence of the latter on the data distribution. Further, Infoprop tracks accumulated model errors along a rollout and provides termination criteria to limit data corruption. We demonstrate the capabilities of Infoprop in the Infoprop-Dyna algorithm, achieving state-of-the-art performance for Dyna-style MBRL on common MuJoCo benchmark tasks while substantially increasing rollout length and data quality.
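To make the abstract's mechanism concrete, here is a minimal sketch, not the authors' actual Infoprop algorithm: it assumes an ensemble dynamics model where each member returns a predicted mean and (aleatoric) standard deviation, treats ensemble disagreement as an epistemic-uncertainty proxy, propagates only aleatoric noise, and truncates the rollout once accumulated epistemic error exceeds a budget. All names, signatures, and the threshold are hypothetical.

```python
import numpy as np

def rollout(ensemble, policy, s0, horizon, budget=1.0, rng=None):
    """Roll out a policy in a learned ensemble model, truncating the
    rollout when accumulated epistemic error exceeds `budget`.
    (Illustrative sketch only; not the published Infoprop rollout.)"""
    rng = rng or np.random.default_rng()
    s, acc_err, traj = s0, 0.0, []
    for _ in range(horizon):
        a = policy(s)
        # Each (hypothetical) ensemble member predicts the mean and the
        # aleatoric std of the next state given (s, a).
        means, stds = zip(*(m(s, a) for m in ensemble))
        means, stds = np.stack(means), np.stack(stds)
        mu = means.mean(axis=0)
        # Disagreement across members ~ epistemic uncertainty;
        # average predicted std ~ aleatoric noise.
        epistemic = means.std(axis=0).mean()
        aleatoric = stds.mean(axis=0)
        # Propagate only the aleatoric noise, suppressing the epistemic
        # spread so it does not distort the synthetic data distribution.
        s_next = mu + aleatoric * rng.standard_normal(mu.shape)
        acc_err += epistemic
        if acc_err > budget:
            break  # terminate before accumulated model error corrupts data
        traj.append((s, a, s_next))
        s = s_next
    return traj
```

Under these assumptions, the key design choice is that epistemic uncertainty is accumulated and used only as a termination signal, rather than being injected into the sampled next states.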
Confirmation: I understand that authors of each paper submitted to EWRL may be asked to review 2-3 other submissions to EWRL.
Serve As Reviewer: ~Sebastian_Trimpe1
Track: Fast Track: published work
Publication Link: https://openreview.net/forum?id=Uh5GRmLlvt
Submission Number: 88