CroCoDiLight: Repurposing Cross-View Completion Encoders for Relighting

Alistair J Foggin; William A P Smith

Published: 26 Jan 2026. Last Modified: 11 Feb 2026. Venue: ICLR 2026 Poster. License: CC BY 4.0

Keywords: cross-view completion, relighting, intrinsic image estimation, albedo estimation, shadow removal

TL;DR: Disentangle CroCo latents into lighting and scene intrinsics, then edit the lighting latent for shadow removal, albedo estimation, relighting, and lighting interpolation.

Abstract: Cross-view completion (CroCo) has proven effective as pre-training for geometric downstream tasks such as stereo depth, optical flow, and point cloud prediction. In this paper, we show that it also learns photometric understanding due to training pairs that differ in illumination. We propose a method to disentangle CroCo latent representations into a single latent vector representing illumination and patch-wise latent vectors representing intrinsic properties of the scene. To do so, we use self-supervised cross-lighting and intrinsic consistency losses on a dataset of pixel-wise aligned image pairs under different illumination, which is two orders of magnitude smaller than the dataset used to train CroCo. We further show that the lighting latent can be manipulated for tasks such as interpolation between lighting conditions, shadow removal, and albedo estimation. This demonstrates the feasibility of using cross-view completion as pre-training for photometric downstream tasks, where training data is more limited.
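As a rough illustration of the cross-lighting and intrinsic consistency ideas named in the abstract, the following toy sketch works directly on latent tokens with NumPy. It is not the authors' implementation: the mean-pooling split, the additive recombination, and every function name (`split_latents`, `recombine`, `cross_lighting_loss`, `intrinsic_consistency_loss`, `interpolate_lighting`) are assumptions made here for illustration, standing in for the learned networks the paper presumably uses.

```python
import numpy as np

# Toy stand-in (assumption): one lighting vector = mean over patch tokens,
# patch-wise intrinsics = residuals. The real method would learn this split.
def split_latents(tokens):
    lighting = tokens.mean(axis=0)      # shape (D,)
    intrinsics = tokens - lighting      # shape (N, D)
    return lighting, intrinsics

# Toy stand-in (assumption): recombination is plain addition.
def recombine(lighting, intrinsics):
    return intrinsics + lighting

def cross_lighting_loss(tokens_a, tokens_b):
    """Swap lighting between two aligned views of the same scene and
    penalise the error when reconstructing each view from the other."""
    light_a, intr_a = split_latents(tokens_a)
    light_b, intr_b = split_latents(tokens_b)
    a_from_b = recombine(light_a, intr_b)   # intrinsics of b, lighting of a
    b_from_a = recombine(light_b, intr_a)   # intrinsics of a, lighting of b
    return (np.mean((a_from_b - tokens_a) ** 2)
            + np.mean((b_from_a - tokens_b) ** 2))

def intrinsic_consistency_loss(tokens_a, tokens_b):
    """Intrinsics of the same scene should agree across lighting conditions."""
    _, intr_a = split_latents(tokens_a)
    _, intr_b = split_latents(tokens_b)
    return np.mean((intr_a - intr_b) ** 2)

def interpolate_lighting(light_a, light_b, t):
    """Linear blend between two lighting latents, with t in [0, 1]."""
    return (1.0 - t) * light_a + t * light_b

# Synthetic example: one zero-mean "scene" observed under two lighting vectors.
rng = np.random.default_rng(0)
scene = rng.normal(size=(16, 8))
scene -= scene.mean(axis=0)
light_a, light_b = rng.normal(size=8), rng.normal(size=8)
tokens_a, tokens_b = scene + light_a, scene + light_b

halfway = recombine(interpolate_lighting(light_a, light_b, 0.5), scene)
print(cross_lighting_loss(tokens_a, tokens_b),
      intrinsic_consistency_loss(tokens_a, tokens_b),
      halfway.shape)
```

In this synthetic setup both losses vanish up to floating-point error because the data is additive by construction. With real CroCo latents the split and recombination would be learned, and the losses would act as the training signal.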

Supplementary Material: zip

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning

Submission Number: 1536
