Identifiable Representation Learning via Architecture Equivariances

19 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: representation learning, identifiability
TL;DR: We propose an unsupervised equivariance-based method for learning identifiable representations of videos that can be intervened on to generate realistic counterfactual videos.
Abstract: Despite their immense success and usefulness, current deep learning systems still lack interpretability, robustness, and out-of-distribution generalisation. In this work we propose a method that helps address some of these issues in image and video data by exploiting equivariances naturally present in the data. It enables learning latent representations that are identifiable and interpretable, and that can be intervened on to visualise counterfactual scenarios. The latent representations naturally correspond to the positions of objects subject to image transformations, so our method effectively trains object detectors in a completely unsupervised manner, without any object annotations. We prove that the learned latent variables are identifiable up to permutation and shifts no larger than the model's receptive field, and perform experiments demonstrating this in practice. We apply the method to real-world videos of balls moving on a mini pool table (translational equivariance), cars driving around a roundabout (rotational equivariance), and objects approaching the camera on a conveyor belt (scale equivariance). In all cases, transformation-equivariant representations are learned without supervision. We show that intervening on the learned latent space results in successful generalisation outside the training distribution, and we visualise realistic counterfactual videos never observed at training time. The method has natural industrial applications with static cameras, such as inspection and surveillance.
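To make the equivariance idea concrete, below is a minimal, hypothetical sketch of a translation-equivariance consistency objective in PyTorch: an encoder predicts latent object positions, and encoding a shifted frame is constrained to match shifting the predicted positions. The encoder architecture, latent layout, and loss form are illustrative assumptions, not the authors' actual method.

```python
# Illustrative sketch only: a translation-equivariance consistency loss.
# The network, latent layout, and loss are assumptions for exposition,
# not the architecture described in the submission.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PositionEncoder(nn.Module):
    """Maps an image to K latent (x, y) object positions (assumed layout)."""

    def __init__(self, num_objects: int = 3):
        super().__init__()
        self.num_objects = num_objects
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 2 * num_objects),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Returns latent positions of shape (batch, num_objects, 2).
        return self.backbone(x).view(-1, self.num_objects, 2)


def equivariance_loss(encoder: PositionEncoder,
                      images: torch.Tensor,
                      shift: tuple[int, int]) -> torch.Tensor:
    """Encoding a shifted image should equal shifting the encoded positions."""
    dy, dx = shift
    # Apply the known image transformation (a pixel shift) to the input.
    shifted = torch.roll(images, shifts=(dy, dx), dims=(2, 3))
    z = encoder(images)           # positions from the original frame
    z_shifted = encoder(shifted)  # positions from the shifted frame
    delta = torch.tensor([dx, dy], dtype=z.dtype, device=z.device)
    # Consistency between the two views, measured in pixel units.
    return F.mse_loss(z_shifted, z + delta)


if __name__ == "__main__":
    # Usage example on a random batch of frames (illustrative only).
    enc = PositionEncoder()
    frames = torch.randn(8, 3, 64, 64)
    loss = equivariance_loss(enc, frames, shift=(4, -3))
    loss.backward()
    print(float(loss))
```

Under this assumed setup, rotational or scale equivariance would replace the pixel shift with a rotation or rescaling of the image and the corresponding transformation of the latent positions; the paper's actual formulation and guarantees are given in the submission itself.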
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1863