Keywords: compression, information theory, deep learning, perceptual quality, rate-distortion
TL;DR: We demonstrate that with a suitable encoding scheme, modifying the decoder is sufficient to approximately achieve any point along the perception-distortion tradeoff in practice.
Abstract: In the context of lossy compression, \citet{blau2019rethinking} adopt a mathematical notion of perceptual quality and define the rate-distortion-perception function, generalizing the classical rate-distortion tradeoff. We consider the notion of (approximately) universal representations in which one may fix an encoder and vary the decoder to (approximately) achieve any point along the perception-distortion tradeoff. We show that the penalty for fixing the encoder is zero in the Gaussian case, and give bounds in the case of arbitrary distributions, under MSE distortion and $W_2^2(\cdot,\cdot)$ perception losses. In principle, a small penalty refutes the need to design an end-to-end system for each particular objective. We provide experimental results on MNIST and SVHN to suggest that there exist practical constructions that suffer only a small penalty, i.e. machine learning models learn representation maps which are approximately universal within their operational capacities.
1 Reply
Loading