Quantifying the similarity of information contained in probabilistic latent spaces

19 Sept 2024 (modified: 08 Oct 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: Information theory, representation learning, disentanglement
TL;DR: We compare the information content of probabilistic representation spaces, and use it to analyze VAE latent spaces and to perform ensemble learning.
Abstract: In contrast to point-based representation spaces, probabilistic representation spaces have a well-defined sense in which they compress information about a dataset. When viewing representation spaces as communication channels, it becomes natural to ask about the similarity of information content of different representation spaces. Starting with classic measures of similarity between hard clustering assignments, we propose a natural modification that generalizes to probabilistic representation spaces. We also propose a practical route to estimating the similarity measure, based on fingerprinting a representation space with a sample of the dataset, which is applicable when the transmitted information is only a handful of bits. Equipped with these similarity measures, we build upon model centrality as a signature of unsupervised disentanglement by assessing "channel centrality" and finding information fragments that are repeatedly learned in VAE and InfoGAN ensembles. Additionally, we evaluate the diversity of information content of the full latent space over the course of training for ensembles of models, and find a striking difference in the homogeneity of information depending on the dataset. Finally, we leverage the differentiability of the proposed method and perform ensemble learning with VAEs by boosting the information content of a set of weak learners, each incapable of representing the global structure of the dataset.
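The abstract only sketches the fingerprinting idea, so the following is a minimal illustrative sketch rather than the authors' estimator. Classic clustering-similarity measures (e.g., normalized mutual information) compare hard assignments; one hedged way to extend the intuition to probabilistic latent spaces is to fingerprint each space by the pairwise overlaps of its posteriors q(z|x) on a fixed sample of data points, then compare two spaces via their fingerprints. The Bhattacharyya coefficient for diagonal Gaussians used below, and the correlation-based comparison, are assumptions chosen for concreteness.

```python
# Hypothetical sketch: fingerprint a probabilistic latent space by the
# matrix of pairwise Bhattacharyya coefficients between diagonal-Gaussian
# posteriors q(z|x_i) on a fixed batch, then compare two spaces by
# correlating their fingerprints. Not the paper's exact similarity measure.
import numpy as np

def bhattacharyya_matrix(mu, var):
    """Pairwise Bhattacharyya coefficients between diagonal Gaussians.

    mu, var: (n_samples, latent_dim) posterior means and variances
    for the same fixed sample of the dataset.
    """
    n = mu.shape[0]
    bc = np.ones((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            avg_var = 0.5 * (var[i] + var[j])
            # Bhattacharyya distance for diagonal Gaussians
            d_b = (0.125 * np.sum((mu[i] - mu[j]) ** 2 / avg_var)
                   + 0.5 * np.sum(np.log(avg_var / np.sqrt(var[i] * var[j]))))
            bc[i, j] = bc[j, i] = np.exp(-d_b)
    return bc

def fingerprint_similarity(fp_a, fp_b):
    """Correlate off-diagonal fingerprint entries: a crude, differentiable-in-
    spirit proxy for shared information content between two latent spaces."""
    iu = np.triu_indices_from(fp_a, k=1)
    return np.corrcoef(fp_a[iu], fp_b[iu])[0, 1]

# Toy usage: two hypothetical encoders evaluated on the same 64 data points.
rng = np.random.default_rng(0)
mu_a = rng.normal(size=(64, 8))
var_a = np.exp(rng.normal(size=(64, 8)))
mu_b = mu_a + 0.1 * rng.normal(size=(64, 8))  # near-copy of space A
sim = fingerprint_similarity(bhattacharyya_matrix(mu_a, var_a),
                             bhattacharyya_matrix(mu_b, var_b := var_a))
print(f"fingerprint similarity: {sim:.3f}")  # close to 1 for near-copies
```

Under these assumptions, two spaces that induce the same soft partition of the sample produce identical fingerprints regardless of how their latent coordinates are parameterized, which is the channel-level notion of similarity the abstract describes.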
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 1896