Keywords: world models, diffusion, model-based reinforcement learning
Abstract: We study diffusion-based world models for reinforcement learning, which offer high generative fidelity but face critical efficiency challenges in control.
Current methods either require heavyweight models at inference or rely on highly sequential imagination, both of which impose prohibitive computational costs.
We propose Horizon Imagination (HI), an on-policy imagination process for discrete stochastic policies that denoises multiple future observations in parallel.
HI incorporates a stabilization mechanism and a novel sampling schedule that decouples the denoising budget from the effective horizon over which denoising is applied, while also supporting fractional steps-per-frame budgets (sub-step budgets).
Experiments on Atari 100K and Craftium show that our approach maintains control performance with a sub-step budget of half the denoising steps (i.e., 0.5 denoising steps per frame) and achieves superior generation quality under varied schedules.
Code is available at https://github.com/leor-c/horizon-imagination.
Primary Area: reinforcement learning
Submission Number: 7791