Exploring Efficient Hardware Accelerator for Learning-Based Image Compression

Chen Chen, Haoyang Zhang, Kaicheng Guo, Xingzi Yu, Weidong Qiu, Zhengwei Qi, Haibing Guan

Published: 2025, Last Modified: 18 Mar 2026. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 2025. CC BY-SA 4.0
Abstract: Recently, learning-based image compression (LIC) methods have surpassed manually designed approaches in both compression quality and bitrate. However, increasing computational demands and insufficient optimization of codec performance have hindered the advancement of LIC acceleration. Most research focuses on optimizing specific components, often neglecting the sources of underutilization during the execution of LIC models. Generally, efficient LIC acceleration encounters three primary challenges: 1) extra overheads introduced by individual optimizations; 2) load and computation imbalances in small kernels; and 3) mismatches between hardware configurations and the LIC models. To address these challenges, we propose a framework named extensive accelerator for LIC (X-LIC) for efficiently exploring the design space under constrained resources. First, we quantitatively characterize a representative LIC model, including its latency, computation size, and temporal utilization across various accelerators. Second, we design a hardware-optimized quantization method to compensate for the lack of LIC-oriented research, particularly regarding data precision, distortion, and resource consumption. Additionally, we propose a parameterized LIC accelerator architecture that integrates seamlessly with existing loop optimization models and supports various LIC operators. Two optimization schemes are proposed: one for redundant computation in transposed convolution, and one for load and computation imbalance in small kernels. Experimental results show that our framework demonstrates significant flexibility across a broad design space, achieving an average of 78%–95% of the theoretical peak performance and up to 688.2/759.1 GOP/s for the encoder/decoder at INT8 precision. As a result, the encoder/decoder can reach up to 33/36 FPS at 720p resolution. An FPGA demo of X-LIC is available at https://github.com/sjtu-tcloud/X-LIC.
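The abstract mentions redundant computation in transposed convolution without describing the scheme itself. As a rough illustration of where such redundancy can come from, the sketch below compares two textbook 1-D formulations: a zero-insertion version (upsample with zeros, then run an ordinary convolution) and a direct scatter version. This is an assumption-laden toy and not the X-LIC method; the function names, the 1-D setting, and the multiply-accumulate (MAC) counters are all hypothetical and chosen only for illustration.

```python
def transposed_conv1d_scatter(x, w, stride):
    """Direct form: each input sample scatters a scaled copy of the kernel."""
    n, k = len(x), len(w)
    y = [0] * ((n - 1) * stride + k)
    macs = 0
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            y[i * stride + j] += xi * wj
            macs += 1
    return y, macs


def transposed_conv1d_zero_insert(x, w, stride):
    """Zero-insertion form: upsample with zeros, pad, then slide a flipped kernel."""
    n, k = len(x), len(w)
    up_len = (n - 1) * stride + 1
    up = [0] * up_len
    real = [False] * up_len          # marks positions holding real samples
    for i, xi in enumerate(x):
        up[i * stride] = xi
        real[i * stride] = True
    pad = k - 1
    padded = [0] * pad + up + [0] * pad
    is_real = [False] * pad + real + [False] * pad
    flipped = w[::-1]
    y, macs, wasted = [], 0, 0
    for o in range(len(padded) - k + 1):
        acc = 0
        for j in range(k):
            acc += padded[o + j] * flipped[j]
            macs += 1
            if not is_real[o + j]:   # multiply by an inserted or padding zero
                wasted += 1
        y.append(acc)
    return y, macs, wasted


if __name__ == "__main__":
    x, w, stride = [1, 2, 3], [1, 10, 100], 2
    y_direct, macs_direct = transposed_conv1d_scatter(x, w, stride)
    y_zero, macs_zero, wasted = transposed_conv1d_zero_insert(x, w, stride)
    print(y_direct == y_zero, macs_direct, macs_zero, wasted)
```

Both forms produce the same output, but the zero-insertion form spends most of its MACs on inserted or padding zeros, and the wasted share grows with the stride. A hardware scheme that skips or restructures those operations would target exactly this overhead, although the abstract does not say how X-LIC does so.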