- Keywords: invertibility, variational autoencoders, encoder, representational complexity, Langevin
- TL;DR: VAEs whose generator mean map is invertible admit small approximate encoders; non-invertible mean maps can force exponentially large encoders.
- Abstract: Training and using modern neural-network-based latent-variable generative models (like Variational Autoencoders) often requires simultaneously training a generative direction along with an inferential (encoding) direction, which approximates the posterior distribution over the latent variables. Thus, the question arises: how complex does the inferential model need to be in order to accurately model the posterior distribution of a given generative model? In this paper, we identify an important property of the generative map that impacts the required size of the encoder. We show that if the generative map is "strongly invertible" (in a sense we suitably formalize), the inferential model need not be much more complex. Conversely, we prove that there exist non-invertible generative maps for which the encoding direction needs to be exponentially larger (under standard assumptions in computational complexity). Importantly, we do not require the generative model to be layerwise invertible, an assumption made in much of the related literature that is not satisfied by many architectures used in practice (e.g. convolution- and pooling-based networks). Thus, we provide theoretical support for the empirical wisdom that learning deep generative models is harder when data lies on a low-dimensional manifold.
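- For concreteness, the setup referenced above can be sketched in the standard VAE formulation (this is the usual textbook setup; the paper's precise definition of "strong invertibility" may differ):

  ```latex
  % Generative direction: a latent z is passed through a mean map G
  z \sim \mathcal{N}(0, I_d), \qquad x \mid z \sim \mathcal{N}\big(G(z), \sigma^2 I_D\big)
  % Inferential direction: an encoder q_\phi approximating the posterior
  q_\phi(z \mid x) \approx p(z \mid x) \propto p(x \mid z)\, p(z)
  % Intuition: if G is (strongly) invertible, p(z \mid x) concentrates
  % near G^{-1}(x), so the encoder need not be much more complex than G^{-1};
  % if G is non-invertible, the posterior can be multimodal and hard to represent.
  ```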
- Questions/Feedback Request for Reviewers: Though the paper is ostensibly about VAEs (which are not invertible models), the main question is how invertibility of the generator's mean map affects the complexity of the encoder. Thus, the core message of the paper concerns how invertibility affects the complexity of an inference mechanism.