CausalPhysics: Unifying Semantic Reasoning, Physical Dynamics, and Counterfactual Simulation in World Models
Keywords: world models, causal reasoning, physical AI, counterfactual simulation, video understanding, physics-informed learning, graph neural networks, semantic grounding
TL;DR: A unified framework combining semantic understanding, causal graph learning, and physics constraints that achieves 46.8% on the Physics-IQ benchmark (+94% relative to Sora) and 71.3% causal consistency (+226% relative to GPT-4V).
Abstract: Current world models fragment physical intelligence into separate pipelines. Vision-language models (VLMs) excel at semantic tasks but struggle with causal physical
reasoning: on our CAUSALPHYSICS-BENCH evaluation, GPT-4V answers only
21.9% of counterfactual physics queries correctly. Video generators produce
realistic frames but understand little physics: Sora attains 24.1%, Runway Gen-3
23.2%, and VideoPoet 21.4% on Physics-IQ (Motamed et al., 2025). Model-based
reinforcement learning (MBRL) systems operate in narrow domains and lack
semantic grounding.
We present CAUSALPHYSICS, a single architecture that bridges these gaps with
three tightly coupled modules: (1) a Semantic-Physical Encoder (SPE) that
fuses DINOv2 vision tokens with frozen LLaMA-2 language representations
through cross-attention; (2) a Causal Graph Induction Module (CGIM) that
discovers a differentiable structural causal model from video, supporting Pearl’s
do-operator and counterfactual queries; (3) a Physics-Constrained Dynamics
Network (PCDN) that propagates states through the learned causal graph while
enforcing differentiable conservation-law constraints.
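
To make the do-operator and counterfactual queries concrete, here is a minimal sketch on a toy linear structural causal model. It is not the CGIM implementation: the three-variable graph, the edge weights, and the helper names (`topological_order`, `simulate`, `abduct_noise`) are all hypothetical and chosen only for illustration. It follows the standard abduction, action, prediction recipe for counterfactuals.

```python
# Toy linear structural causal model (SCM). Illustrative assumption only:
# each variable is a weighted sum of its parents plus an exogenous noise term.

def topological_order(parents):
    """Return variables ordered so that every parent precedes its children."""
    order, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for parent in parents[node]:
            visit(parent)
        order.append(node)

    for node in parents:
        visit(node)
    return order


def simulate(parents, noise, do=None):
    """Evaluate the SCM; `do` maps intervened variables to fixed values."""
    do = do or {}
    values = {}
    for node in topological_order(parents):
        if node in do:
            # do-operator: drop the node's parents and noise, clamp the value.
            values[node] = do[node]
        else:
            values[node] = noise[node] + sum(
                weight * values[parent]
                for parent, weight in parents[node].items()
            )
    return values


def abduct_noise(parents, observed):
    """Recover the noise terms consistent with a fully observed world."""
    return {
        node: observed[node]
        - sum(weight * observed[parent]
              for parent, weight in parents[node].items())
        for node in parents
    }


# Hypothetical graph: push force -> velocity -> travel distance.
PARENTS = {
    "force": {},
    "velocity": {"force": 0.5},
    "distance": {"velocity": 2.0},
}

observed = {"force": 4.0, "velocity": 2.5, "distance": 5.0}
noise = abduct_noise(PARENTS, observed)                        # abduction
counterfactual = simulate(PARENTS, noise, do={"force": 8.0})   # action + prediction
print(counterfactual)
```

The query answered here is "what would the distance have been had the force been 8.0, all else equal?". A learned, differentiable graph would replace the fixed weights, but the intervention logic has the same shape.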
On the official Physics-IQ v1.0 toolkit, CAUSALPHYSICS scores 46.8 ± 0.9, a
47% relative gain over V-JEPA 2 (31.8 ± 1.4) and roughly double Sora's score (24.1).
Causal consistency reaches 71.3 ± 1.2% on CAUSALPHYSICS-BENCH versus
21.9 ± 0.8% for GPT-4V ($p<0.001$, paired t-test, 3 seeds). Out-of-distribution
(OOD) generalization improves by 20.2 percentage points over the strongest baseline.
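
The relative gains quoted above can be checked from the reported means alone. The sketch below is only that arithmetic; the helper name `relative_gain` is ours, and the inputs are the mean scores stated in this abstract (error bars are ignored).

```python
def relative_gain(new_score, baseline_score):
    """Relative improvement of new_score over baseline_score, in percent."""
    return 100.0 * (new_score - baseline_score) / baseline_score


# Mean scores as reported in the abstract.
gains = {
    "Physics-IQ vs V-JEPA 2": relative_gain(46.8, 31.8),
    "Physics-IQ vs Sora": relative_gain(46.8, 24.1),
    "Causal consistency vs GPT-4V": relative_gain(71.3, 21.9),
}

for name, gain in gains.items():
    print(f"{name}: {gain:.1f}% relative gain")
```

Relative gain is distinct from the percentage-point difference used for the OOD result: 71.3% versus 21.9% is a 49.4-point gap but a much larger relative improvement.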
Submission Number: 39