Learning Navigable World Models via Latent Energy Shaping

Published: 02 Mar 2026, Last Modified: 15 Apr 2026, Venue: ICLR 2026 Workshop World Models, Readers: Everyone, License: CC BY 4.0
Keywords: World model, Offline Planning, EBM
TL;DR: This paper introduces a latent world model that explicitly shapes the energy landscape to form convex-like basins, enabling reliable gradient-based planning and improving performance on offline goal-conditioned navigation tasks.
Abstract: Learning generalizable policies from large, unlabeled offline datasets is a key challenge in creating autonomous agents. While offline goal-conditioned reinforcement learning (GCRL) offers a powerful framework for this, existing methods often struggle with robust long-horizon planning. Model-free approaches can fail to generalize to novel scenarios, while traditional model-based planners must contend with compounding errors and complex, multimodal search spaces that make finding a solution difficult. In this work, we introduce a novel framework that harnesses the expressive power of Energy-Based Models (EBMs) to learn a robust and navigable latent world model. Our central contribution is to adapt a training objective that explicitly shapes the energy landscape: rather than only learning the distribution of plausible transitions, the model is also trained to support goal-conditioned planning. Specifically, our method encourages the composed energy between a start and goal state to form a convex-like basin, so that gradient-based planning reliably converges to a meaningful next-step latent target. Our full pipeline integrates this latent EBM with a distance-preserving state encoder and a skill-conditioned actor to ground latent plans into actions. We evaluate our approach on a suite of challenging offline GCRL benchmarks, where our experiments demonstrate that shaping the energy landscape enables long-horizon planning.
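The planning idea in the abstract can be illustrated with a minimal sketch. This is not the paper's learned model: the composed energy below is a hand-written quadratic stand-in (an assumption), chosen only because it forms a single convex basin between a start and a goal latent. All function names, weights, and step sizes are hypothetical.

```python
import numpy as np

# Hypothetical stand-in for a learned composed energy. The first term keeps
# the candidate latent z close to the start (reachable in one step); the
# second term pulls it toward the goal. Their sum is a single convex basin.
def composed_energy(z, start, goal, w_start=1.0, w_goal=0.25):
    return w_start * np.sum((z - start) ** 2) + w_goal * np.sum((z - goal) ** 2)

# Analytic gradient of the toy energy above with respect to z.
def composed_energy_grad(z, start, goal, w_start=1.0, w_goal=0.25):
    return 2.0 * w_start * (z - start) + 2.0 * w_goal * (z - goal)

# Gradient-based planning: descend the composed energy from the start latent
# to obtain a next-step latent target.
def plan_next_latent(start, goal, step_size=0.1, num_steps=200):
    z = start.copy()
    for _ in range(num_steps):
        z = z - step_size * composed_energy_grad(z, start, goal)
    return z

start = np.array([0.0, 0.0])
goal = np.array([4.0, 2.0])
z_next = plan_next_latent(start, goal)

# For this quadratic toy, the basin minimum is the weighted mean of start and goal.
z_star = (1.0 * start + 0.25 * goal) / (1.0 + 0.25)
```

Because the toy basin is convex, plain gradient descent from the start latent reaches the unique minimizer `z_star`, which lies near the start but displaced toward the goal. In a learned model the energy would come from a network and the gradient from automatic differentiation; the sketch only shows why a convex-like basin makes this descent well behaved.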
Submission Number: 65