FDAM: Filter-Dedicated Approximate Multiplier Design for Real-Time CNN Acceleration

Published: 01 Jan 2025, Last Modified: 07 Nov 2025. Venue: IEEE Trans. Comput. Aided Des. Integr. Circuits Syst. 2025. License: CC BY-SA 4.0
Abstract: Video super-resolution (VSR) is widely used in high-definition applications such as HDTVs and smartphones, which require a dedicated upscaling technique for real-time full-HD generation. To reduce on-chip buffers for large output feature maps, a streaming VSR accelerator may employ an output-stationary dataflow, which leads to large energy consumption caused by frequent filter switching. To mitigate this, we introduce a new filter-dedicated multiplier design for real-time VSR acceleration. We replace costly multipliers with adders, shifters, and multiplexers (MUXes), a structure referred to as unified multiple constant multiplications (UMCM). The conventional UMCM, however, may incur considerable area/power overhead due to the unified topology constraint among different filter sets. To address this problem, we propose a new approximated MCM (AMCM) problem that relaxes the constraint, and an approximate compatible graph synthesis (A-CGS) framework that efficiently solves AMCM by jointly searching for approximated filters and constructing a unified graph. Additionally, we suggest a lightweight fine-tuning method that freezes the approximated filters and fine-tunes only the biases, which can recover the original model’s accuracy within a few epochs. Experimental results with synthetic data demonstrate that AMCM reduces the area by up to 49.8%, 44.8%, and 40.3% when considering constant sets of 2, 4, and 8, respectively. Our designs for the SR application achieve up to a 73.3% reduction in energy consumption. Experiments with the Set5 and Set14 datasets show that our model with bias correction achieves restoration performance similar to that of the eight-bit models.
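The idea of replacing a multiplier with adders, shifters, and a MUX, and of relaxing a constant to a cheaper nearby value, can be illustrated with a small software sketch. This is not the paper's UMCM graph or its A-CGS framework: all function names below are hypothetical, the decomposition used is the standard canonical signed-digit (non-adjacent) form, and the approximation step is a simple brute-force search over a bounded error window chosen purely for illustration. A real UMCM would also share intermediate adder nodes across constants, which this sketch does not model.

```python
def csd_digits(c):
    """Signed digits (-1, 0, +1) of integer c, least significant first,
    in non-adjacent form (no two neighbouring digits are both nonzero)."""
    digits = []
    while c != 0:
        if c & 1:
            d = 2 - (c & 3)  # +1 when c % 4 == 1, -1 when c % 4 == 3
            c -= d
        else:
            d = 0
        digits.append(d)
        c >>= 1
    return digits


def shift_add_multiply(x, c):
    """Compute x * c using only shifts, additions, and subtractions."""
    acc = 0
    for shift, d in enumerate(csd_digits(c)):
        if d == 1:
            acc += x << shift
        elif d == -1:
            acc -= x << shift
    return acc


def adder_cost(c):
    """Adders/subtractors needed for one constant: nonzero digits minus one."""
    nonzero = sum(1 for d in csd_digits(c) if d != 0)
    return max(nonzero - 1, 0)


def muxed_multiply(x, constants, select):
    """Model of a MUX choosing which constant of a set is applied to x."""
    return shift_add_multiply(x, constants[select])


def cheapest_nearby(c, max_error):
    """Illustrative approximation (an assumption, not the paper's method):
    pick the value within +/- max_error of c with the fewest adders,
    breaking ties by closeness to c."""
    candidates = range(c - max_error, c + max_error + 1)
    return min(candidates, key=lambda k: (adder_cost(k), abs(k - c)))
```

For example, 23 decomposes as 32 - 8 - 1 and needs two adders, while the neighbouring value 24 = 32 - 8 needs only one, so allowing an error of 1 on that coefficient halves its adder count in this toy cost model.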