Correcting Flaws in Common Disentanglement Metrics

TMLR Paper2320 Authors

01 Mar 2024 (modified: 14 Apr 2024) · Under review for TMLR
Abstract: Disentangled representations are those in which distinct features, such as size or shape, are represented by distinct neurons. Quantifying the extent to which a given representation is disentangled is not straightforward; multiple metrics have been proposed. In this paper, we identify two failings of existing metrics, which allow them to assign a high score to a model that is still entangled, and we propose two new metrics that redress these problems. First, we use hypothetical toy examples to demonstrate the failure modes we identify in existing metrics. Then, we show that similar situations occur in practice. Finally, we validate our metrics on the downstream task of compositional generalization. We measure the performance of six existing disentanglement models on this task and show that performance is (a) generally quite poor, (b) correlated, to varying degrees, with most disentanglement metrics, and (c) most strongly correlated with our newly proposed metrics. Anonymous code to reproduce our results is available at https://github.com/anon296/anon.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Simon_Kornblith1
Submission Number: 2320