Abstract: Variational autoencoders (VAEs) powerfully model high-dimensional data
by combining classical probabilistic modeling with neural networks. In
a VAE, each datapoint is generated by mapping a low-dimensional latent
variable through a flexible implicit distribution parametrized by
neural networks. A VAE then employs variational inference to infer the
posterior of these latents.
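As a concrete reference for this generative and inference setup, here is a minimal VAE sketch in PyTorch; the layer sizes, Gaussian likelihood, and variable names are illustrative assumptions, not the paper's exact model.

```python
# Minimal VAE sketch (illustrative; not the paper's exact architecture).
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, x_dim=784, z_dim=16, h_dim=256):
        super().__init__()
        # Encoder q(z|x): outputs mean and log-variance of a Gaussian.
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.enc_mu = nn.Linear(h_dim, z_dim)
        self.enc_logvar = nn.Linear(h_dim, z_dim)
        # Decoder p(x|z): a neural network mapping latents to data space.
        self.dec = nn.Sequential(
            nn.Linear(z_dim, h_dim), nn.ReLU(), nn.Linear(h_dim, x_dim)
        )

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.enc_mu(h), self.enc_logvar(h)
        # Reparametrization: z = mu + sigma * eps, eps ~ N(0, I).
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        x_hat = self.dec(z)
        # ELBO = E_q[log p(x|z)] - KL(q(z|x) || p(z)), prior p(z) = N(0, I).
        recon = -((x - x_hat) ** 2).sum(dim=-1)  # Gaussian log-lik up to a constant
        kl = 0.5 * (mu**2 + logvar.exp() - 1 - logvar).sum(dim=-1)
        # Posterior collapse: q(z|x) ~= N(0, I) for every x, i.e. kl ~= 0.
        return -(recon - kl).mean()  # negative ELBO as the training loss
```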
Though flexible, VAEs often suffer from posterior collapse, a
phenomenon in which the inferred posterior of the latent variables
equals their (uninformative) prior. In this paper, we study posterior
collapse through the lens of latent variable identifiability. We find
that posterior collapse can occur in classical probabilistic models
(e.g., Gaussian mixture models) fitted with Markov chain Monte Carlo
methods whenever their latent variables are non-identifiable. This
finding runs contrary to the belief that posterior collapse is
specific to VAEs that use neural networks or variational
approximations.
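A toy illustration of this point (the mixture weights and observation below are hypothetical, and the collapse already appears in the exact posterior, with no variational approximation or MCMC involved): in a two-component Gaussian mixture whose components share identical parameters, the component assignment is non-identifiable, and Bayes' rule returns the prior for every observation.

```python
# Toy check: with identical mixture components, the latent assignment z
# is non-identifiable and its posterior collapses to the prior p(z).
import numpy as np
from scipy.stats import norm

prior = np.array([0.3, 0.7])             # p(z): hypothetical mixing weights
mus, sigma = np.array([0.0, 0.0]), 1.0   # identical components

x = 1.5                                   # any observation
lik = norm.pdf(x, loc=mus, scale=sigma)   # p(x|z) is constant in z here
posterior = prior * lik / np.sum(prior * lik)

print(posterior)                          # [0.3, 0.7]: equals the prior
```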
Mathematically, we prove that the posterior collapses if and only if
the generative model is non-identifiable. Further, we show that VAEs
are non-identifiable and hence suffer from posterior collapse.
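A one-line Bayes'-rule sketch of this equivalence, under the paraphrased reading that non-identifiability at parameters $\theta$ means the likelihood does not depend on the latent:

```latex
% Posterior collapse <=> the likelihood is constant in z (our paraphrase).
p_\theta(z \mid x) = \frac{p_\theta(x \mid z)\, p(z)}{p_\theta(x)} = p(z)
\ \text{for all } x, z
\quad\Longleftrightarrow\quad
p_\theta(x \mid z) = p_\theta(x) \ \text{for all } x, z.
```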
Finally, we propose a class of identifiable VAEs that leverage optimal
transport maps to resolve the latent variable non-identifiability.
Across datasets, we show that the identifiable VAEs prevent posterior
collapse.
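As a generic sketch of the optimal-transport ingredient (not the authors' exact architecture; the class and function names are ours), a Brenier map can be realized as the gradient of an input-convex neural network (ICNN), which is injective when the network is strictly convex and can therefore serve as an identifiable decoder layer.

```python
# Sketch: a Brenier (optimal transport) map as the gradient of an
# input-convex neural network (ICNN). Sizes and activations are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNN(nn.Module):
    """f(z) convex in z: nonnegative hidden-to-hidden weights, convex activations."""
    def __init__(self, z_dim=16, h_dim=64):
        super().__init__()
        self.Wz0 = nn.Linear(z_dim, h_dim)
        self.Wz1 = nn.Linear(z_dim, h_dim)
        self.Wh1 = nn.Parameter(torch.rand(h_dim, h_dim) * 0.1)  # clamped >= 0 below
        self.out = nn.Parameter(torch.rand(h_dim) * 0.1)         # clamped >= 0 below

    def forward(self, z):
        h = F.softplus(self.Wz0(z))
        h = F.softplus(self.Wz1(z) + h @ self.Wh1.clamp(min=0).T)
        # The quadratic term makes f strictly convex, so grad f is injective.
        return h @ self.out.clamp(min=0) + 0.5 * (z ** 2).sum(dim=-1)

def brenier_map(f, z):
    """T(z) = grad_z f(z): the Brenier map induced by the convex potential f."""
    z = z.requires_grad_(True)
    return torch.autograd.grad(f(z).sum(), z, create_graph=True)[0]

z = torch.randn(8, 16)
T_z = brenier_map(ICNN(), z)  # an injective transport of the latents
```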