Keywords: Concept Discovery (e.g., SAEs, dictionary learning), Methods (probing, steering, causal interventions), Feature Geometry
Other Keywords: video models, V-JEPA, causal degeneracy, representation portability, subspace extraction, self-supervised learning
TL;DR: We apply a causal intervention pipeline to V-JEPA 2, revealing early-layer causal encoding of motion and discovering extreme "causal degeneracy" where geometrically distinct subspaces hold equivalent functional roles.
Abstract: Video world models trained with Joint
Embedding Predictive Architectures (JEPAs)
achieve strong performance on motion
understanding benchmarks, but whether
their latent representations encode causally
functional state variables remains unknown.
We apply a three-stage causal-state-discovery
pipeline (combining L1-regularized probing,
class-conditional PCA, difference-in-means
subspace extraction, and three families of causal
interventions with four matched controls) to
the frozen encoder of V-JEPA 2 ViT-L (326M
parameters, d = 1024, 24 layers) on a synthetic
controlled-sequence dataset of 400 clips across
8 motion directions. V-JEPA 2 encodes motion
direction from remarkably early layers (96%
dense-probe accuracy at layer 4; 100% by
layer 7), using a distributed subspace occupying
57% of latent dimensions. Causal ablation at
layer 7 produces effects 43× larger than random
direction controls, confirming the identified
subspace is causally privileged. The SAS–RCE
dissociation, in which moderate subspace alignment
(SAS = 0.35) coexists with near-perfect
retained causal effect (RCE = 0.99), reveals
that causal structure is far more stable than its
geometric embedding. Findings generalize to
complex synthetic stimuli, real Kinetics video
(5.3× CE ratio), and V-JEPA 2 ViT-H (54×
CE ratio with near-perfect cross-architecture
CCA alignment). These results provide the
first intervention-based evidence that JEPA
video models encode motion as a causally
functional latent variable, and introduce SAS
and RCE as portability metrics for mechanistic
interpretability.
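As a rough illustration of the ablation comparison described in the abstract, the sketch below extracts a difference-in-means subspace from toy activations, projects it out, and compares the resulting change in a linear readout against random subspaces of the same dimension. It is a minimal sketch and not the authors' pipeline: the Gaussian toy data, the least-squares probe, the mean-squared-change effect measure, and every name in the code (`diff_in_means_basis`, `causal_effect`, `ce_ratio`, and so on) are assumptions made for illustration. In particular, the abstract does not define how the CE ratio is computed; here it is assumed to be the target-subspace effect divided by the average random-control effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, per_class, d = 8, 50, 64  # 400 toy "clips"; d is much smaller than the real 1024

# Toy stand-in for encoder latents: one mean vector per motion direction plus noise.
class_means = rng.normal(scale=3.0, size=(n_classes, d))
labels = np.repeat(np.arange(n_classes), per_class)
acts = class_means[labels] + rng.normal(size=(labels.size, d))
X = acts - acts.mean(axis=0)  # work with centered activations throughout


def diff_in_means_basis(X, labels, k):
    """Orthonormal (d, k) basis spanning the top-k class-mean difference directions."""
    diffs = np.stack([X[labels == c].mean(axis=0) for c in np.unique(labels)])
    _, _, vt = np.linalg.svd(diffs, full_matrices=False)
    return vt[:k].T


def random_basis(d, k, rng):
    """Orthonormal (d, k) basis of a random subspace (the control condition)."""
    q, _ = np.linalg.qr(rng.normal(size=(d, k)))
    return q


def ablate(X, basis):
    """Remove the component of each activation that lies in span(basis)."""
    return X - (X @ basis) @ basis.T


# Assumed readout: a least-squares linear probe onto centered one-hot labels.
onehot = np.eye(n_classes)[labels]
probe, *_ = np.linalg.lstsq(X, onehot - onehot.mean(axis=0), rcond=None)


def causal_effect(X, basis, probe):
    """Mean squared change in probe outputs caused by ablating span(basis)."""
    delta = (X - ablate(X, basis)) @ probe
    return float(np.mean(delta ** 2))


k = n_classes - 1  # centered class means span at most n_classes - 1 directions
target = diff_in_means_basis(X, labels, k)
target_effect = causal_effect(X, target, probe)
control_effect = np.mean(
    [causal_effect(X, random_basis(d, k, rng), probe) for _ in range(20)]
)
ce_ratio = target_effect / control_effect
print(f"target effect / random-control effect = {ce_ratio:.1f}")
```

Matching the control subspaces to the target in dimension keeps the comparison fair, since ablating more directions would remove more signal regardless of which directions are chosen.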
Submission Number: 11