What Makes a Representation Relightable? Probing Visual Priors via Augmented Latent Intrinsics

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Relighting, Probing, Latent Intrinsics, Visual Priors, Representations
Abstract: Image-to-image relighting requires a representation that disentangles scene properties from illumination. Recent methods use latent intrinsic representations but remain under-constrained and often fail on challenging materials such as metal and glass. A natural hypothesis is that injecting powerful, pretrained visual priors should resolve these failures. We find the opposite: features from top-performing semantic encoders often degrade relighting quality, revealing a fundamental trade-off between semantic abstraction and photometric fidelity. This paper investigates what makes a representation "relightable." We introduce Augmented Latent Intrinsics (ALI), a method that resolves this trade-off by strategically fusing features from a dense, pixel-aligned visual encoder into a latent-intrinsic framework, while leveraging self-supervised refinement to overcome the scarcity of paired real-world training data. Trained only on unlabeled, real-world image pairs, ALI achieves strong relighting improvements, with the largest gains on complex materials.
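The abstract describes fusing dense, pixel-aligned encoder features into a latent-intrinsic representation. The paper's actual architecture is not given here, so the following is only a minimal illustrative sketch of one common way such a fusion could be wired up (a 1x1 projection plus a learned per-pixel gate with a residual connection); all module names, dimensions, and design choices below are assumptions, not ALI's implementation.

```python
import torch
import torch.nn as nn

class FeatureFusion(nn.Module):
    """Hypothetical sketch: inject dense encoder features into a
    latent-intrinsic map. Shapes and modules are illustrative only."""

    def __init__(self, intrinsic_dim: int = 256, encoder_dim: int = 768):
        super().__init__()
        # 1x1 conv aligns encoder channels with the intrinsic code
        self.proj = nn.Conv2d(encoder_dim, intrinsic_dim, kernel_size=1)
        # learned per-pixel gate decides how much semantic feature to admit
        self.gate = nn.Sequential(
            nn.Conv2d(intrinsic_dim * 2, intrinsic_dim, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, intrinsic: torch.Tensor, dense_feat: torch.Tensor) -> torch.Tensor:
        # intrinsic: (B, C, H, W) latent-intrinsic map
        # dense_feat: (B, D, H, W) pixel-aligned encoder features
        f = self.proj(dense_feat)
        g = self.gate(torch.cat([intrinsic, f], dim=1))
        # residual injection: gating can suppress semantic features where
        # they would harm photometric detail
        return intrinsic + g * f

fusion = FeatureFusion()
out = fusion(torch.randn(2, 256, 32, 32), torch.randn(2, 768, 32, 32))
print(out.shape)  # torch.Size([2, 256, 32, 32])
```

The gated residual form is one plausible reading of "strategic" fusion: with the gate near zero the latent-intrinsic pathway is preserved unchanged, which matches the paper's observation that semantic features can hurt photometric fidelity if injected indiscriminately.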
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 13477