- TL;DR: We introduce an unbiased perceptual loss function and metric and show that it improves recovery of texture during super resolution
- Abstract: Single Image Super Resolution (SISR) has significantly improved with Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs), often achieving order of magnitude better pixelwise accuracies (distortions) and state-of-the-art perceptual accuracy. Due to the stochastic nature of GAN reconstruction and the ill-posed nature of the problem, perceptual accuracy tends to correlate inversely with pixelwise accuracy which is especially detrimental to SISR, where preservation of original content is an objective. GAN stochastics can be guided by intermediate loss functions such as the VGG featurewise loss, but these features are typically derived from biased pre-trained networks. Similarly, measurements of perceptual quality such as the human Mean Opinion Score (MOS) and no-reference measures have issues with pre-trained bias. The spatial relationships between pixel values can be measured without bias using the Grey Level Co-occurence Matrix (GLCM), which was found to match the cardinality and comparative value of the MOS while reducing subjectivity and automating the analytical process. In this work, the GLCM is also directly used as a loss function to guide the generation of perceptually accurate images based on spatial collocation of pixel values. We compare GLCM based loss against scenarios where (1) no intermediate guiding loss function, and (2) the VGG feature function are used. Experimental validation is carried on X-ray images of rock samples, characterised by significant number of high frequency texture features. We find GLCM-based loss to result in images with higher pixelwise accuracy and better perceptual scores.
- Keywords: Super Resolution Generative Adversarial Networks, Perceptual Loss Functions