DIAMOND-LoL: Enforcing Lieb-Robinson Locality in Diffusion World Models for Long-Horizon Consistency
Keywords: World Models; Lieb-Robinson Locality; Reinforcement Learning; Diffusion Models
Abstract: World models enable agents to reason and plan by learning inside a simulator, improving sample efficiency in reinforcement learning. However, while diffusion-based world models avoid detail loss by operating in pixel space, their standard $\ell_2$ loss introduces a critical physical inconsistency: by averaging over plausible futures in partially observable scenarios, it produces blurry boundaries and acausal displacements, artifacts that violate the environment's fundamental principle of finite-speed propagation. To address this challenge, we propose \textbf{DIAMOND-LoL}, a diffusion training framework that adds a \textbf{L}ieb-R\textbf{o}binson \textbf{L}ocality loss (LoL loss) to enforce finite-speed propagation of pixel dynamics. Motivated by the Lieb-Robinson bound, the LoL loss penalizes structural changes outside a data-driven light-cone radius, keeping predictions within the environment's reachable set and avoiding mode-averaging interpolation. Moreover, we prove that the LoL loss is zero only when the prediction boundary lies within the finite-propagation set, and we show that it converts long-horizon error growth from exponential to linear. Experiments demonstrate that DIAMOND-LoL provides a principled, physically consistent training objective for diffusion world models, with particular value in safety-critical scenarios.
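The penalty described in the abstract — suppressing pixel changes outside a data-driven light cone — can be sketched as follows. This is a minimal illustrative NumPy implementation, not the authors' method: the function name `lol_loss`, the per-pixel change threshold, and the use of a Chebyshev (square) dilation for the light cone are all assumptions made here for concreteness.

```python
import numpy as np

def lol_loss(prev_frame, pred_frame, active_mask, radius, threshold=0.1):
    """Hypothetical sketch of a Lieb-Robinson locality penalty.

    Penalizes pixel changes in `pred_frame` that fall outside the
    light cone of `radius` pixels around previously active pixels
    (`active_mask`), i.e. changes the environment could not have
    propagated in one step under a finite speed limit.
    """
    H, W = prev_frame.shape
    # Pixels that changed between the previous and predicted frame.
    diff = np.abs(pred_frame - prev_frame)
    changed = diff > threshold
    # Reachable set: dilate the active mask by the light-cone radius
    # (Chebyshev ball, i.e. square neighborhood, chosen for simplicity).
    reachable = np.zeros_like(active_mask, dtype=bool)
    for y, x in zip(*np.nonzero(active_mask)):
        y0, y1 = max(0, y - radius), min(H, y + radius + 1)
        x0, x1 = max(0, x - radius), min(W, x + radius + 1)
        reachable[y0:y1, x0:x1] = True
    # Penalize only the magnitude of changes outside the reachable set.
    violation = diff * (changed & ~reachable)
    return violation.sum()
```

A change adjacent to an active pixel incurs no penalty, while an equally large change far outside the cone contributes its full magnitude, which is the asymmetry that discourages mode-averaged, acausal displacements.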
Supplementary Material: zip
Submission Number: 14