InfoNCE is variational inference in a recognition parameterised model

Laurence Aitchison; Stoil Krasimirov Ganev

Published: 26 Mar 2024, Last Modified: 17 Sept 2024. Accepted by TMLR. License: CC BY 4.0.

Abstract: Here, we develop a new class of Bayesian latent variable model, the recognition parameterised model (RPM). RPMs have an implicit likelihood, which is defined in terms of the recognition model. Therefore, it is not possible to do traditional "generation" with RPMs. Instead, RPMs are designed to learn good latent representations of data (in modern parlance, they solve a self-supervised learning task). Indeed, the RPM implicit likelihood is specifically designed so that it drops out of the VI objective, the ELBO. That allows us to learn an RPM without a "reconstruction" step, which is believed to be at the root of poor latent representations learned by VAEs. Indeed, in a very specific setting where we learn the optimal prior, the RPM ELBO becomes equal to the mutual information (MI; up to a constant), establishing a connection to pre-existing self-supervised learning methods such as InfoNCE.

Submission Length: Regular submission (no more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=SGNIcTOtvG&referrer=%5BAuthor%20Console%5D(%2Fgroup%3Fid%3DTMLR%2FAuthors%23your-submissions)

Changes Since Last Submission:
* Respond to all AC comments from the previous review.
* Clarify the work.
* Contextualise the work in terms of e.g. Latent Dirichlet Allocation (which is a Bayesian generative model designed to extract useful representations, and not to sample).
* Contextualise the work in terms of some of the follow-up work on RPMs.
* Give a unified mathematical framework encompassing different variants of RPMs, including empirical-marginal, true-marginal and estimated-marginal RPMs (Appendix A).
* Discuss measure theoretic considerations for RPMs with deterministic recognition models (Appendix B).

Assigned Action Editor: ~Jinwoo_Shin1

Submission Number: 1749
