Abstract: Variational autoencoders with deep hierarchies of stochastic layers are known to suffer from posterior collapse, where the top layers fall back to the prior and become independent of the input.
We suggest that the hierarchical VAE objective explicitly includes the variance of the function parameterizing the mean and variance of the latent Gaussian distribution, and that this parameterizing function is itself often high-variance.
Building on this observation, we generalize the VAE neural networks by incorporating a smoothing parameter, motivated by Gaussian analysis, that reduces the variance of the parameterizing functions, and we show that this can help alleviate posterior collapse.
We further show that under such smoothing the VAE loss exhibits a phase transition, where the top-layer KL divergence sharply drops to zero at a critical value of the smoothing parameter.
We validate this phenomenon across a range of model configurations and datasets.
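A minimal sketch of the idea, not the authors' implementation: one way to realize a smoothing parameter on the encoder that parameterizes the latent mean and variance is to average the network over Gaussian-perturbed inputs. The class name, the parameter `sigma`, and the Monte Carlo estimator below are illustrative assumptions.

```python
# Hypothetical sketch: Gaussian smoothing of the network that parameterizes
# the latent mean and log-variance, controlled by a smoothing parameter sigma.
import torch
import torch.nn as nn

class SmoothedGaussianEncoder(nn.Module):
    def __init__(self, in_dim, latent_dim, sigma=0.1, n_samples=8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, 2 * latent_dim),  # outputs [mu, log_var]
        )
        self.sigma = sigma          # smoothing parameter (assumed mechanism)
        self.n_samples = n_samples  # Monte Carlo samples for the smoothed function

    def forward(self, x):
        if self.sigma == 0.0:
            out = self.net(x)  # no smoothing: recover the standard encoder
        else:
            # Approximate the Gaussian-smoothed function
            #   f_sigma(x) = E_{eps ~ N(0, sigma^2 I)}[ f(x + eps) ]
            # by averaging over noisy copies of the input; larger sigma gives a
            # lower-variance parameterizing function.
            noisy = x.unsqueeze(0) + self.sigma * torch.randn(
                self.n_samples, *x.shape, device=x.device)
            out = self.net(noisy).mean(dim=0)
        mu, log_var = out.chunk(2, dim=-1)
        return mu, log_var

# Usage: sweep sigma to probe how the top-layer KL behaves as smoothing increases.
enc = SmoothedGaussianEncoder(in_dim=784, latent_dim=32, sigma=0.2)
mu, log_var = enc(torch.randn(16, 784))
```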
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=wrNj29-80Y