Posterior Collapse and Latent Variable Non-identifiability

Nov 23, 2020 (edited Jan 06, 2021) | AABI2020 | Readers: Everyone
  • Abstract: Variational autoencoders (VAEs) powerfully model high-dimensional data by combining classical probabilistic modeling with neural networks. In a VAE, each datapoint is generated by a low-dimensional latent variable passed through a flexible implicit distribution parametrized by neural networks. A VAE then employs variational inference to infer the posterior of these latents. Though flexible, VAEs often suffer from posterior collapse, a phenomenon where the inferred posterior of the latent variables is equal to its (uninformative) prior. In this paper, we study posterior collapse through the lens of latent variable identifiability. We find that posterior collapse can occur in classical probabilistic models (e.g. Gaussian mixture models) fitted with Markov chain Monte Carlo methods whenever their latent variables are not identifiable. This finding runs contrary to the belief that posterior collapse is specific to VAEs that use neural networks or variational approximations. Mathematically, we prove that the posterior collapses if and only if the generative model is non-identifiable. Further, a VAE is non-identifiable and hence suffers from posterior collapse. Finally, we propose a class of identifiable VAEs that leverage optimal transport maps to resolve the latent variable non-identifiability. Across datasets, we show that the identifiable VAEs prevent posterior collapse.
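The Gaussian mixture case mentioned in the abstract can be illustrated with a small numerical sketch. This is not the paper's construction. It assumes one simple form of non-identifiability, a two-component mixture whose components share the same mean, and computes the exact posterior over the latent assignment with Bayes' rule instead of running MCMC. The function names `normal_pdf` and `assignment_posterior`, the prior weights, and the observation `x` are all hypothetical choices made for illustration.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def assignment_posterior(x, weights, means, std=1.0):
    """Exact posterior p(z = k | x) of a Gaussian mixture via Bayes' rule."""
    joint = [w * normal_pdf(x, m, std) for w, m in zip(weights, means)]
    total = sum(joint)
    return [j / total for j in joint]


# Illustrative prior over the latent assignment z and a single observation.
prior = [0.3, 0.7]
x = 1.5

# Assumed non-identifiable case: both components have the same mean, so the
# likelihood does not depend on z and the posterior reduces to the prior.
collapsed = assignment_posterior(x, prior, means=[0.0, 0.0])

# Well-separated components: the observation is informative about z.
informative = assignment_posterior(x, prior, means=[-2.0, 2.0])

print("prior:      ", prior)
print("collapsed:  ", collapsed)
print("informative:", informative)
```

When the component means coincide, the likelihood terms cancel in Bayes' rule and the posterior matches the prior regardless of the observation. With separated means, the posterior concentrates on the component nearer to `x`.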