Keywords: World Models, Unsupervised Pre-training, Temporal Relative Embeddings, Horizon-Calibrated Uncertainty
TL;DR: We propose a new world-model pre-training framework that explicitly models how uncertainty grows with the temporal horizon, learning more robust dynamics representations for downstream RL tasks.
Abstract: Pre-training world models on large, action-free video datasets offers a promising path toward generalist agents, but a fundamental flaw undermines this paradigm. Prevailing methods train models to predict a single, deterministic future, an objective that is ill-posed for inherently stochastic environments where actions are unknown. We contend that a world model should instead learn a structured, probabilistic representation of the future where predictive uncertainty correctly scales with the temporal horizon. To achieve this, we introduce a pre-training framework, **H**orizon-c**A**librated
**U**ncertainty **W**orld **M**odel (HAUWM), built on a probabilistic ensemble that predicts frames at randomly sampled future horizons. The core of our method is a Horizon-Calibrated Uncertainty (HCU) loss, which explicitly shapes the latent space by encouraging predictive variance to grow as the model projects further into the future. This approach yields a latent dynamics model that is not only predictive but also equipped with a reliable measure of temporal confidence. When fine-tuned for downstream control, our pre-trained model significantly outperforms state-of-the-art methods across a diverse suite of benchmarks, including MetaWorld, the DeepMind Control Suite, and RoboDesk. These results highlight the critical role of structured uncertainty in robust decision-making.
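The abstract describes an HCU loss that shapes predictive variance to grow with the sampled horizon. The paper's exact formulation is not given here, so the sketch below is a hypothetical illustration: a Gaussian negative log-likelihood plus a hinge penalty whenever a longer-horizon prediction reports less variance than a shorter-horizon one in the same batch. The function names (`hcu_loss`, `gaussian_nll`) and the `margin` parameter are assumptions, not the authors' API.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_nll(target, mean, log_var):
    # Per-dimension Gaussian negative log-likelihood (up to a constant).
    return 0.5 * (log_var + (target - mean) ** 2 / np.exp(log_var))

def hcu_loss(targets, means, log_vars, horizons, margin=0.0):
    """Hypothetical sketch of a Horizon-Calibrated Uncertainty loss.

    targets/means/log_vars: (B, D) arrays for a batch whose i-th element
    was rolled out to horizon `horizons[i]` (randomly sampled per sample).
    Adds a hinge penalty whenever predicted log-variance *decreases*
    as the horizon grows, encouraging uncertainty to scale with time.
    """
    nll = gaussian_nll(targets, means, log_vars).mean()
    v = log_vars.mean(axis=1)          # per-sample average log-variance
    order = np.argsort(horizons)        # sort samples by horizon
    v_sorted = v[order]
    # Penalize any drop in log-variance from a shorter to a longer horizon.
    penalty = np.maximum(0.0, margin + v_sorted[:-1] - v_sorted[1:]).mean()
    return nll + penalty

# Toy batch: log-variance grows with horizon, so the penalty term vanishes.
B, D = 8, 4
horizons = rng.integers(1, 16, size=B)
targets = rng.normal(size=(B, D))
means = targets + 0.1 * rng.normal(size=(B, D))
log_vars = np.log(0.1 + 0.05 * horizons)[:, None] * np.ones((B, D))
loss = hcu_loss(targets, means, log_vars, horizons)
```

In a full probabilistic ensemble, each member would produce its own `(mean, log_var)` pair and the disagreement across members would contribute to the variance estimate; the monotonicity penalty above stands in for whatever calibration term the paper actually uses.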
Primary Area: reinforcement learning
Submission Number: 14550