Is the Discrete VAE’s Power Stuck in its Prior?

Published: 09 Dec 2020, Last Modified: 05 May 2023 · ICBINB 2020 Poster
Keywords: discrete neural, vector quantized variational autoencoder, vq-vae
TL;DR: Generative neural models with a discrete latent space appear to rely on a powerful prior distribution to generate samples; increasing the expressiveness of the encoder leads to worse model fit.
Abstract: We investigate why probabilistic neural models with discrete latent variables are effective at generating high-quality images. We hypothesize that fitting a more flexible variational posterior distribution and performing joint training of the encoder, decoder, and prior distribution should improve model fit. However, we find that modifying the training procedure for the well-known vector quantized variational autoencoder (VQ-VAE) leads to models with lower marginal likelihood for held-out data and degraded sample quality. These results indicate that current discrete VAEs use their encoder and decoder as a deterministic compression bottleneck. The distribution-matching power of these models lies solely in the prior distribution, which is typically trained after clamping the encoder and decoder.
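For context, the "clamped" training the abstract describes is the standard two-stage VQ-VAE recipe: stage 1 fits the encoder, codebook, and decoder as a reconstruction (compression) bottleneck; stage 2 freezes them and fits a prior over the discrete code indices. The sketch below is a minimal illustration in PyTorch, not the authors' code; the `encoder`, `decoder`, `prior`, and `loader` objects are hypothetical placeholders, and the latents are flattened to single code vectors for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VectorQuantizer(nn.Module):
    """Nearest-neighbour quantization with a straight-through gradient estimator."""

    def __init__(self, num_codes=512, code_dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1.0 / num_codes, 1.0 / num_codes)
        self.beta = beta

    def forward(self, z_e):
        # z_e: (batch, code_dim) continuous encoder outputs (flattened for brevity).
        distances = torch.cdist(z_e, self.codebook.weight)    # (batch, num_codes)
        indices = distances.argmin(dim=1)                     # discrete latent codes
        z_q = self.codebook(indices)                          # quantized vectors
        # Codebook loss + commitment loss.
        vq_loss = (F.mse_loss(z_q, z_e.detach())
                   + self.beta * F.mse_loss(z_e, z_q.detach()))
        # Straight-through estimator: copy decoder gradients from z_q to z_e.
        z_q = z_e + (z_q - z_e).detach()
        return z_q, indices, vq_loss


def train_stage1(encoder, quantizer, decoder, loader, lr=2e-4):
    """Stage 1: fit encoder, codebook, and decoder as a compression bottleneck."""
    params = (list(encoder.parameters()) + list(quantizer.parameters())
              + list(decoder.parameters()))
    opt = torch.optim.Adam(params, lr=lr)
    for x in loader:
        z_q, _, vq_loss = quantizer(encoder(x))
        loss = F.mse_loss(decoder(z_q), x) + vq_loss          # reconstruction + VQ terms
        opt.zero_grad()
        loss.backward()
        opt.step()


def train_stage2(encoder, quantizer, prior, loader, lr=2e-4):
    """Stage 2: clamp encoder/codebook and fit a prior over the discrete codes."""
    encoder.eval()
    quantizer.eval()
    opt = torch.optim.Adam(prior.parameters(), lr=lr)
    for x in loader:
        with torch.no_grad():                                 # frozen bottleneck
            _, indices, _ = quantizer(encoder(x))
        logits = prior(indices)                               # e.g. an autoregressive model
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), indices.view(-1))
        opt.zero_grad()
        loss.backward()
        opt.step()
```

The paper's hypothesis targets exactly this separation: the joint, more flexible alternative it tests corresponds to training all three components together rather than running stage 2 after clamping stage 1.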