Keywords: generative modeling, autoregressive models, score-based methods, diffusion models
TL;DR: We improve the sample quality of autoregressive models by proposing a simple tweak to the standard maximum likelihood training procedure, and incorporating score-based sampling techniques inspired by score-based diffusion models.
Abstract: We introduce a simple yet effective modification to the standard maximum likelihood estimation (MLE) framework for autoregressive generative models. Rather than maximizing a single unconditional likelihood of the data under the model, we maximize a family of \textit{noise-conditional} likelihoods defined on the data perturbed by a continuum of noise levels. We find that models trained this way are more robust to noise, obtain higher test likelihoods, and generate higher-quality images. They also admit a novel score-based sampling scheme that combats the classical \textit{covariate shift} problem arising during sample generation in autoregressive models. Applying this augmentation to autoregressive image models, we obtain 3.32 bits per dimension on the ImageNet 64x64 dataset and substantially improve the quality of generated samples as measured by the Fréchet Inception distance (FID), from 37.50 to 13.50 on the CIFAR-10 dataset.
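One way to read the noise-conditional objective described in the abstract is sketched below. This is an illustrative formulation, not the paper's stated equation: the symbols $p_\theta$ (the model), $q_\sigma$ (the perturbation kernel), $p(\sigma)$ (a distribution over noise levels), $\tilde{x}$ (the perturbed data), and $D$ (the data dimension) are all assumed notation, and the Gaussian form of the perturbation is an assumption as well.

```latex
% Standard MLE: a single unconditional likelihood of the clean data.
\max_\theta \; \mathbb{E}_{x \sim p_{\mathrm{data}}} \big[ \log p_\theta(x) \big]

% Assumed noise-conditional variant: average the log-likelihood of
% perturbed data over a continuum of noise levels \sigma.
\max_\theta \; \mathbb{E}_{\sigma \sim p(\sigma)} \,
               \mathbb{E}_{x \sim p_{\mathrm{data}}} \,
               \mathbb{E}_{\tilde{x} \sim q_\sigma(\tilde{x} \mid x)}
               \big[ \log p_\theta(\tilde{x} \mid \sigma) \big],
\qquad q_\sigma(\tilde{x} \mid x) = \mathcal{N}(\tilde{x};\, x,\, \sigma^2 I) \quad \text{(assumed)}

% Autoregressive factorization of each noise-conditional likelihood
% over the D dimensions of the perturbed data.
\log p_\theta(\tilde{x} \mid \sigma)
  = \sum_{i=1}^{D} \log p_\theta(\tilde{x}_i \mid \tilde{x}_{<i}, \sigma)
```

Under this reading, a single network conditioned on $\sigma$ models the perturbed data at every noise level, and the gradient $\nabla_{\tilde{x}} \log p_\theta(\tilde{x} \mid \sigma)$ is the kind of score that a score-based sampler would use.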
Student Paper: Yes