Keywords: deep generative models, variational autoencoders
Abstract: Good likelihoods do not imply great sample quality. However, the precise manner in which models trained to achieve good likelihoods fail at sample quality remains poorly understood. In this work, we consider the task of image generative modeling with variational autoencoders and posit that the nature of high-dimensional image data distributions poses an intrinsic challenge. In particular, much of the entropy in these natural image distributions is attributable to visually imperceptible information. This signal dominates the training objective, giving models an easy way to achieve competitive likelihoods without successfully modeling the visually perceptible bits. Based on this hypothesis, we explicitly decompose the task of generative modeling into two steps: we first prioritize the modeling of visually perceptible information to achieve good sample quality, and then subsequently model the imperceptible information, which carries the bulk of the likelihood signal, to achieve good likelihoods. Our work highlights the well-known adage that "not all bits are created equal" and demonstrates that this property can and should be exploited in the design of variational autoencoders.
One-sentence Summary: We prioritize the modeling of visually perceptible bits
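Below is a minimal sketch of the two-step decomposition described in the abstract, written in PyTorch. It is illustrative only: it assumes, hypothetically, that the visually perceptible information is approximated by a 4x-downsampled view of a 32x32 image and the imperceptible information by the remaining residual; the paper's actual decomposition, architectures, and objectives may differ. Stage 1 fits a small VAE to the perceptible view, and stage 2 then models the residual detail that carries most of the bits.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerceptibleVAE(nn.Module):
    """Stage 1: a small VAE over the low-resolution, visually perceptible view."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 256), nn.ReLU())
        self.to_mu = nn.Linear(256, latent_dim)
        self.to_logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 3 * 8 * 8))

    def forward(self, x_low):                          # x_low: (B, 3, 8, 8)
        h = self.enc(x_low)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization
        recon = self.dec(z).view_as(x_low)
        # Negative ELBO: Gaussian reconstruction term + KL to a standard normal prior.
        rec = F.mse_loss(recon, x_low, reduction='sum') / x_low.size(0)
        kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / x_low.size(0)
        return rec + kl

class DetailModel(nn.Module):
    """Stage 2: models the residual, imperceptible detail given the low-res view."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(64, 3, 3, padding=1))

    def forward(self, x, x_low):                       # x: (B, 3, 32, 32)
        upsampled = F.interpolate(x_low, scale_factor=4, mode='nearest')
        residual = x - upsampled                       # the "imperceptible" bits
        # MSE as a Gaussian-likelihood proxy; a real model would use a proper
        # discretized density so the two stages compose into a full likelihood.
        return F.mse_loss(self.net(upsampled), residual, reduction='sum') / x.size(0)

def train_two_stage(loader, epochs=1, device='cpu'):
    # Step 1: prioritize the visually perceptible information.
    stage1 = PerceptibleVAE().to(device)
    opt1 = torch.optim.Adam(stage1.parameters(), lr=1e-3)
    for _ in range(epochs):
        for x, _ in loader:                            # x: (B, 3, 32, 32) in [0, 1]
            x_low = F.avg_pool2d(x.to(device), 4)      # "perceptible" 8x8 view
            opt1.zero_grad(); stage1(x_low).backward(); opt1.step()

    # Step 2: model the imperceptible residual, which carries the bulk of the bits.
    stage2 = DetailModel().to(device)
    opt2 = torch.optim.Adam(stage2.parameters(), lr=1e-3)
    for _ in range(epochs):
        for x, _ in loader:
            x = x.to(device)
            opt2.zero_grad(); stage2(x, F.avg_pool2d(x, 4)).backward(); opt2.step()
    return stage1, stage2
```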