Neural Bayes: A Generic Parameterization Method for Unsupervised Learning

Devansh Arpit; Huan Wang; Caiming Xiong; Richard Socher; Yoshua Bengio

28 Sept 2020 (modified: 22 Jun 2025), ICLR 2021 Conference Blind Submission, Readers: Everyone

Keywords: unsupervised learning, clustering, manifold separation, representation learning, Bayes rule

Abstract: We introduce a parameterization method called Neural Bayes that allows computing statistical quantities which are in general difficult to compute, and that opens avenues for formulating new objectives for unsupervised representation learning. Specifically, given an observed random variable $\mathbf{x}$ and a latent discrete variable $z$, we can express $p(\mathbf{x}|z)$, $p(z|\mathbf{x})$ and $p(z)$ in closed form in terms of a sufficiently expressive function (e.g., a neural network) using our parameterization, without restricting the class of these distributions. To demonstrate its usefulness, we develop two independent use cases for this parameterization: 1. Disjoint Manifold Separation: Neural Bayes allows us to formulate an objective that can optimally label samples from disjoint manifolds present in the support of a continuous distribution. This can be seen as a specific form of clustering in which each disjoint manifold in the support is a separate cluster. We design clustering tasks that obey this formulation and empirically show that the model optimally labels the disjoint manifolds. 2. Mutual Information Maximization (MIM): MIM has become a popular means for self-supervised representation learning. Neural Bayes allows us to compute the mutual information between observed random variables $\mathbf{x}$ and latent discrete random variables $z$ in closed form. We use this for learning image representations and show its usefulness on downstream classification tasks.
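The abstract does not spell out the construction, so the following is only a minimal sketch under one assumption: the network ends in a softmax whose $k$-th output $L_k(\mathbf{x})$ is read as $p(z=k|\mathbf{x})$. Under that assumption, Bayes rule gives $p(z=k) = \mathbb{E}_{\mathbf{x}}[L_k(\mathbf{x})]$ and $p(\mathbf{x}|z=k)/p(\mathbf{x}) = L_k(\mathbf{x})/\mathbb{E}_{\mathbf{x}}[L_k(\mathbf{x})]$, and the mutual information is $\mathbb{E}_{\mathbf{x}}[\sum_k L_k(\mathbf{x}) \log (L_k(\mathbf{x})/p(z=k))]$. The function name `neural_bayes_quantities`, the random logits standing in for network outputs, and the `eps` constant are all hypothetical and used only for illustration; expectations are replaced by batch averages.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis (one row per sample).
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def neural_bayes_quantities(logits, eps=1e-12):
    """Batch estimates under the assumed softmax parameterization.

    logits: array of shape (n_samples, K), standing in for network outputs.
    Returns (posterior, prior, density_ratio, mutual_information).
    """
    posterior = softmax(logits)            # assumed p(z=k | x_i), shape (n, K)
    prior = posterior.mean(axis=0)         # batch estimate of p(z=k) = E_x[L_k(x)]
    density_ratio = posterior / (prior + eps)  # p(x_i | z=k) / p(x_i) by Bayes rule
    # I(x; z) = E_x[ sum_k p(z=k|x) * log( p(z=k|x) / p(z=k) ) ]
    log_ratio = np.log((posterior + eps) / (prior + eps))
    mutual_information = np.mean(np.sum(posterior * log_ratio, axis=1))
    return posterior, prior, density_ratio, mutual_information

# Hypothetical usage: random logits in place of a trained network.
rng = np.random.default_rng(0)
logits = rng.normal(size=(256, 4))
posterior, prior, density_ratio, mi = neural_bayes_quantities(logits)
```

In this sketch the mutual information estimate lies between 0 (every sample gets the same posterior) and $\log K$ (each sample is assigned confidently to one state and the states are used equally often), which is why it can serve as a training objective once the logits come from a differentiable model.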

One-sentence Summary: We propose a simple neural network parameterization based on the Bayes rule to model conditional distributions, and then use it to formulate two objectives: 1. manifold separation; 2. mutual information maximization.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Community Implementations: [2 code implementations (CatalyzeX)](https://www.catalyzex.com/paper/neural-bayes-a-generic-parameterization/code)

Reviewed Version (pdf): https://openreview.net/references/pdf?id=_4qHZipfIp
