Rethinking RGB Color Representation for Image Restoration Models

20 Sept 2023 (modified: 11 Feb 2024), Submitted to ICLR 2024
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: representation learning, image restoration, representation space, loss function, image super-resolution, image deblurring, image denoising, interpretability
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: As an alternative to the three color channels, we present a new representation space for images, where pairwise per-pixel loss functions are redefined for better training of various image restoration models.
Abstract: The per-pixel distance loss defined in the RGB color domain has been an almost compulsory choice for training image restoration models, despite its well-known tendency to guide the model toward blurry, unrealistic textures. To enhance the visual plausibility of restored images, recent methods employ auxiliary objectives such as perceptual or adversarial losses. Nevertheless, they still do not eliminate the reliance on the per-pixel distance in the RGB domain. In this work, we redefine the very representation space over which the per-pixel distance is measured. Our augmented RGB ($a$RGB) space is the latent space of an autoencoder comprising a single affine decoder and a nonlinear encoder, trained to preserve color information while capturing low-level image structures. As a direct consequence, per-pixel distance metrics, e.g., the $L_{1}$, $L_{2}$, and smooth $L_{1}$ losses, can be defined over our $a$RGB space in the same way as over the RGB space. We then replace the per-pixel losses in the RGB space with their $a$RGB counterparts when training various image restoration models, such as those for deblurring, denoising, and perceptual super-resolution. By simply redirecting the loss function to act upon the proposed $a$RGB space, we demonstrate boosted performance without any modification to model architectures or other hyperparameters. Our results imply that the RGB color space is not the optimal representation for image restoration tasks.
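The core idea described in the abstract is to keep the familiar per-pixel loss but measure it in a learned latent space rather than raw RGB. A minimal sketch of that pattern is shown below; the per-pixel nonlinear encoder here (a 1×1 linear map plus ReLU, implemented in numpy) is a hypothetical stand-in for the paper's learned $a$RGB encoder, not its actual architecture.

```python
import numpy as np

def encode(img, W, b):
    """Hypothetical per-pixel nonlinear encoder standing in for the
    learned aRGB encoder: a pointwise linear map (like a 1x1 conv)
    followed by ReLU, mapping 3 RGB channels to a higher-dim latent.

    img: (H, W, 3) array; W: (D, 3) weights; b: (D,) bias.
    Returns an (H, W, D) latent image.
    """
    z = img @ W.T + b
    return np.maximum(z, 0.0)

def argb_l1_loss(pred, target, W, b):
    """Per-pixel L1 distance measured in the latent space instead of
    raw RGB -- the same loss formula, just on encoded representations."""
    return np.abs(encode(pred, W, b) - encode(target, W, b)).mean()

# Example usage with random images and a random (untrained) encoder.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 3))   # latent dim 8 is an arbitrary choice
b = np.zeros(8)
pred = rng.random((16, 16, 3))
target = rng.random((16, 16, 3))
loss = argb_l1_loss(pred, target, W, b)
```

In the paper's setup the encoder is trained jointly with an affine decoder so that the latent retains color information; here the weights are random purely for illustration. The point of the sketch is the training-time change: `argb_l1_loss` is a drop-in replacement for an RGB-space L1 loss.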
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2244