Amortised MAP Inference for Image Super-resolution

Casper Kaae Sønderby; Jose Caballero; Lucas Theis; Wenzhe Shi; Ferenc Huszár

Amortised MAP Inference for Image Super-resolution

Casper Kaae Sønderby, Jose Caballero, Lucas Theis, Wenzhe Shi, Ferenc Huszár

Published: 06 Feb 2017, Last Modified: 22 Jun 2025ICLR 2017 OralReaders: Everyone

Abstract: Image super-resolution (SR) is an underdetermined inverse problem, where a large number of plausible high resolution images can explain the same downsampled image. Most current single image SR methods use empirical risk minimisation, often with a pixel-wise mean squared error (MSE) loss. However, the outputs from such methods tend to be blurry, over-smoothed and generally appear implausible. A more desirable approach would employ Maximum a Posteriori (MAP) inference, preferring solutions that always have a high probability under the image prior, and thus appear more plausible. Direct MAP estimation for SR is non-trivial, as it requires us to build a model for the image prior from samples. Here we introduce new methods for \emph{amortised MAP inference} whereby we calculate the MAP estimate directly using a convolutional neural network. We first introduce a novel neural network architecture that performs a projection to the affine subspace of valid SR solutions ensuring that the high resolution output of the network is always consistent with the low resolution input. We show that, using this architecture, the amortised MAP inference problem reduces to minimising the cross-entropy between two distributions, similar to training generative models. We propose three methods to solve this optimisation problem: (1) Generative Adversarial Networks (GAN) (2) denoiser-guided SR which backpropagates gradient-estimates from denoising to train the network, and (3) a baseline method using a maximum-likelihood-trained image prior. Our experiments show that the GAN based approach performs best on real image data. Lastly, we establish a connection between GANs and amortised variational inference as in e.g. variational autoencoders.

TL;DR: Probabilisticly motivated image superresolution using a projection to the subspace of valid solutions

Conflicts: twitter.com, ku.dk, dtu.dk

Keywords: Theory, Computer vision, Deep learning

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 3 code implementations](https://www.catalyzex.com/paper/amortised-map-inference-for-image-super/code)

11 Replies

Loading