Can Push-forward Generative Models Fit Multimodal Distributions?Download PDF

Published: 31 Oct 2022, Last Modified: 12 Mar 2024NeurIPS 2022 AcceptReaders: Everyone
Keywords: Generative Models, GAN, VAE, Diffusion Models, Score-based Models, Expressivity, Lipschitz Mappings
Abstract: Many generative models synthesize data by transforming a standard Gaussian random variable using a deterministic neural network. Among these models are the Variational Autoencoders and the Generative Adversarial Networks. In this work, we call them "push-forward" models and study their expressivity. We formally demonstrate that the Lipschitz constant of these generative networks has to be large in order to fit multimodal distributions. More precisely, we show that the total variation distance and the Kullback-Leibler divergence between the generated and the data distribution are bounded from below by a constant depending on the mode separation and the Lipschitz constant. Since constraining the Lipschitz constants of neural networks is a common way to stabilize generative models, there is a provable trade-off between the ability of push-forward models to approximate multimodal distributions and the stability of their training. We validate our findings on one-dimensional and image datasets and empirically show that the recently introduced diffusion models do not suffer of such limitation.
TL;DR: This paper highlights that there exists a trade-off for GANs and VAEs between their expressivity and the stability of their training.
Supplementary Material: pdf
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](
21 Replies