- Abstract: Models parameterized by deep neural networks have achieved state-of-the-art results in many domains. These models are usually trained using the maximum likelihood principle with a finite set of observations. However, training deep probabilistic models with maximum likelihood can lead to the issue we refer to as input forgetting. In deep generative latent-variable models, input forgetting corresponds to posterior collapse---a phenomenon in which the latent variables are driven independent from the observations. However input forgetting can happen even in the absence of latent variables. We attribute input forgetting in deep probabilistic models to the finite sample dilemma of maximum likelihood. We formalize this problem and propose a learning criterion---termed reflective likelihood---that explicitly prevents input forgetting. We empirically observe that the proposed criterion significantly outperforms the maximum likelihood objective when used in classification under a skewed class distribution. Furthermore, the reflective likelihood objective prevents posterior collapse when used to train stochastic auto-encoders with amortized inference. For example in a neural topic modeling experiment, the reflective likelihood objective leads to better quantitative and qualitative results than the variational auto-encoder and the importance-weighted auto-encoder.
- Keywords: new learning criterion, penalized maximum likelihood, posterior inference in deep generative models, input forgetting issue, latent variable collapse issue
- TL;DR: Training deep probabilistic models with maximum likelihood often leads to "input forgetting". We identify a potential cause and propose a new learning criterion to alleviate the issue.