Abstract: Despite achieving state-of-the-art rate-distortion performance that exceeds VVC in PSNR and MS-SSIM, recent learned image compression (LIC) methods still exhibit significant perceptual limitations at low bitrates. Reconstructed images often suffer from blurring, inaccurate colors, and a lack of textural detail, highlighting the well-known divergence between conventional distortion metrics and human visual perception.
Although several perceptual LIC approaches have been proposed to bridge this gap, many are hampered by unstable training that hinders their practical applicability.
To address these limitations, we propose ST-LIC (Stable Training for Perception-Oriented Learned Image Compression), which introduces two key innovations for stable and effective perceptual optimization.
First, during the initial training phase, we analyze the gradient contribution of each loss component to identify a balance point, preventing any single loss from dominating or becoming negligible during updates.
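The abstract does not give the exact balancing rule, but the idea of equalizing per-loss gradient contributions can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes we have already measured the gradient norm each unweighted loss term produces at the start of training, and it rescales the loss weights so every term contributes the same gradient magnitude to each update.

```python
def balance_loss_weights(grad_norms, ref_idx=0):
    """Return per-loss weights that equalize gradient contributions.

    grad_norms: measured gradient norms of each unweighted loss term
    ref_idx: index of the loss whose contribution is used as the target

    Each weight scales its loss so the weighted gradient norm matches
    the reference, so no single loss dominates or becomes negligible.
    """
    ref = grad_norms[ref_idx]
    return [ref / g for g in grad_norms]

# Hypothetical gradient norms measured during the initial training phase:
# distortion (MSE), perceptual, and adversarial losses, in that order.
grad_norms = [2.0, 0.5, 8.0]
weights = balance_loss_weights(grad_norms)

# After rescaling, every weighted loss contributes the same gradient norm.
contributions = [w * g for w, g in zip(weights, grad_norms)]
print(weights)        # [1.0, 4.0, 0.25]
print(contributions)  # [2.0, 2.0, 2.0]
```

In practice the per-loss gradient norms would be measured with separate backward passes through the shared parameters; the numbers above are placeholders.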
Second, we integrate a UNet-based refiner module after the decoder and apply the distortion and perceptual losses to distinct outputs, enabling a more precise and balanced optimization of the rate-distortion-perception trade-off.
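The split-objective idea can be illustrated schematically: the distortion loss is applied to the raw decoder output, while the perceptual loss is applied to the refiner's output, so the two objectives do not pull on the same tensor. The function names, loss stand-ins, and weights below are hypothetical placeholders, not the paper's actual modules.

```python
def rdp_loss(x, rate, decoder_out, refiner, dist_loss, perc_loss, lam):
    """Rate-distortion-perception objective with losses on distinct outputs.

    x: ground-truth image (here a scalar stand-in)
    rate: estimated bitrate term
    decoder_out: raw reconstruction from the decoder
    refiner: post-decoder refinement module (UNet-like in the paper)
    """
    x_ref = refiner(decoder_out)          # refined reconstruction
    d = dist_loss(decoder_out, x)         # distortion on the raw decode
    p = perc_loss(x_ref, x)               # perception on the refined image
    return lam["rate"] * rate + lam["dist"] * d + lam["perc"] * p

# Toy stand-ins: an identity-plus-shift "refiner" and scalar "losses".
x, decoder_out, rate = 1.0, 0.8, 0.3
refiner = lambda y: y + 0.1
dist_loss = lambda a, b: (a - b) ** 2
perc_loss = lambda a, b: abs(a - b)
lam = {"rate": 1.0, "dist": 1.0, "perc": 1.0}

total = rdp_loss(x, rate, decoder_out, refiner, dist_loss, perc_loss, lam)
print(round(total, 4))  # ≈ 0.44 = 0.3 (rate) + 0.04 (dist) + 0.1 (perc)
```

Keeping the distortion target on the decoder output while the perceptual target acts only on the refined output is what lets each stage specialize instead of compromising on a single reconstruction.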
Experimental results demonstrate that ST-LIC trains significantly more stably when an adversarial loss is incorporated, while simultaneously delivering reconstructions with superior subjective visual quality.
Team Name: Evolve
Submission Number: 10