Causally Disentangled World Models: Guiding Exploration with an Agency Bonus

02 Sept 2025 (modified: 12 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Causality, World Models
TL;DR: We present the Causal Disentanglement World Model (CDWM), which imposes a structural causal assumption to learn an interventional model of environment dynamics.
Abstract: Model-Based Reinforcement Learning (MBRL) promises to improve sample efficiency, yet conventional world models learn a purely observational, black-box model of dynamics. This leads to causal confounding: the model fails to distinguish the environment's autonomous evolution from agent-induced interventions, which results in poor generalization and inefficient exploration. To resolve this, we introduce the Causal Disentanglement World Model (CDWM), which learns the necessary interventional model by imposing a structural causal assumption. Our dual-path architecture decomposes state transitions into an uncontrollable Environment Pathway and a controllable Intervention Pathway, making causal effects identifiable from observational data. Building on this, we derive the Agency Bonus, a principled intrinsic reward that quantifies the agent's causal influence and guides exploration. Extensive experiments on the Atari100k benchmark show that CDWM achieves state-of-the-art performance, outperforming prior methods in sample efficiency and planning accuracy. Ablation studies confirm the architecture's adaptability and the exploration mechanism's effectiveness in sparse-reward settings. Our results establish that imposing causal structure is a critical step toward building more robust, interpretable, and generalizable world models.
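To make the dual-path decomposition concrete, the following is a minimal sketch, not the authors' implementation. It assumes an additive latent transition s' ≈ s + f_env(s) + f_int(s, a) and a norm-based Agency Bonus; the class name CDWMSketch, the function agency_bonus, and the MLP architecture are all hypothetical choices made for illustration.

```python
import torch
import torch.nn as nn

class CDWMSketch(nn.Module):
    """Illustrative dual-path world model.

    Decomposes the latent transition into an uncontrollable Environment
    Pathway f_env(s) and a controllable Intervention Pathway f_int(s, a).
    The additive combination is an assumption of this sketch.
    """

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        # Environment Pathway: autonomous drift, conditioned on state only.
        self.f_env = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )
        # Intervention Pathway: agent-induced change, conditioned on (s, a).
        self.f_int = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, s: torch.Tensor, a: torch.Tensor):
        delta_env = self.f_env(s)                       # autonomous evolution
        delta_int = self.f_int(torch.cat([s, a], -1))   # causal effect of action
        s_next = s + delta_env + delta_int              # predicted next state
        return s_next, delta_int

def agency_bonus(delta_int: torch.Tensor) -> torch.Tensor:
    # One plausible instantiation of the Agency Bonus: reward the
    # magnitude of the agent-attributable change in the latent state.
    return delta_int.norm(dim=-1)
```

Separating the two pathways means the intervention term isolates what the agent's action changed, so its magnitude (or a divergence between action-conditioned and action-free predictions) can serve as an intrinsic reward for exploration in sparse-reward settings.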
Primary Area: reinforcement learning
Submission Number: 781