Disentanglement and Generalization Under Correlation Shifts

Published: 25 Mar 2022, Last Modified: 12 Mar 2024
ICLR 2022 OSC Oral
Keywords: disentanglement, correlation, mutual information, conditional mutual information, subspace independence
TL;DR: We establish minimization of conditional mutual information, rather than mutual information, as the appropriate objective for learning disentangled representations that remain robust under correlation shifts.
Abstract: Correlations between factors of variation are prevalent in real-world data. However, often such correlations are not robust (e.g., they may change between domains, datasets, or applications) and we wish to avoid exploiting them. Disentanglement methods aim to learn representations which capture different factors of variation in latent subspaces. A common approach involves minimizing the mutual information between latent subspaces, such that each encodes a single underlying attribute. However, this fails when attributes are correlated. We solve this problem by enforcing independence between subspaces conditioned on the available attributes, which allows us to remove only dependencies that are not due to the correlation structure present in the training data. We achieve this via an adversarial approach to minimize the conditional mutual information (CMI) between subspaces with respect to categorical variables. We first show theoretically that CMI minimization is a good objective for robust disentanglement on linear problems with Gaussian data. We then apply our method on real-world datasets based on MNIST and CelebA, and show that it yields models that are disentangled and robust under correlation shift, including in weakly supervised settings.
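The abstract's central distinction between mutual information and conditional mutual information can be illustrated numerically. The toy setup below is not from the paper: a binary attribute `y` induces a spurious correlation between two discrete latent codes `z1` and `z2`, each a noisy copy of `y`. Because the codes depend on each other only through the attribute, their marginal mutual information I(z1; z2) is large, while the conditional mutual information I(z1; z2 | y) is near zero, which is why minimizing CMI leaves the attribute-induced correlation structure intact while removing any extra dependence.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data (not from the paper): a binary attribute y induces a
# correlation between two latent codes z1 and z2. Given y, the codes are
# independent, so I(z1; z2) > 0 but I(z1; z2 | y) ~ 0.
n = 200_000
y = rng.integers(0, 2, size=n)
z1 = (y + (rng.random(n) < 0.1)) % 2  # noisy copy of y (10% flip rate)
z2 = (y + (rng.random(n) < 0.1)) % 2  # independent noisy copy of y

def mi(a, b):
    """Empirical mutual information (in nats) of two binary arrays."""
    joint = np.zeros((2, 2))
    for i in range(2):
        for j in range(2):
            joint[i, j] = np.mean((a == i) & (b == j))
    pa, pb = joint.sum(axis=1), joint.sum(axis=0)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / np.outer(pa, pb)[mask])))

def cmi(a, b, c):
    """Empirical I(a; b | c) = sum over values v of p(c=v) * I(a; b | c=v)."""
    return float(sum(np.mean(c == v) * mi(a[c == v], b[c == v])
                     for v in np.unique(c)))

print(f"I(z1; z2)     = {mi(z1, z2):.4f} nats")   # large: spurious dependence via y
print(f"I(z1; z2 | y) = {cmi(z1, z2, y):.4f} nats")  # near zero
```

The paper's adversarial estimator targets the same quantity in continuous latent subspaces, but this plug-in estimate on discrete variables already shows the gap the abstract describes: MI penalizes the correlation inherited from the training attributes, CMI does not.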
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2112.14754/code)