Correcting Flaws in Common Disentanglement Metrics

Louis Mahon; Lei Sha; Thomas Lukasiewicz

18 Sept 2023 (modified: 06 Aug 2024), ICLR 2024 Conference Withdrawn Submission

Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Keywords: disentanglement, metrics, compositional generalization

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.

TL;DR: We identify two failings in existing disentanglement metrics, propose two new metrics to fix them, and show that our metrics are more predictive on a downstream task.

Abstract: Recent years have seen growing interest in learning disentangled representations, in which distinct features, such as size or shape, are represented by distinct neurons. Quantifying the extent to which a given representation is disentangled is not straightforward, and multiple metrics have been proposed. In this paper, we identify two failings of existing metrics, which mean that they can assign a high score to a model that is still entangled, and we propose two new metrics that redress these problems. We first demonstrate these failure modes on hypothetical toy examples, then show that similar situations occur in practice, and finally validate our metrics on the downstream task of compositional generalization. We show that performance on this task is (a) generally quite poor, (b) correlated with most disentanglement metrics, and (c) most strongly correlated with our newly proposed metrics.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 1390