Accelerating Convolutional Neural Networks in Frequency Domain via Kernel-Sharing Approach

Published: 01 Jan 2023, Last Modified: 09 Apr 2025 · ASP-DAC 2023 · CC BY-SA 4.0
Abstract: Convolutional neural networks (CNNs) are typically computationally heavy. Fast algorithms such as the fast Fourier transform (FFT) promise to reduce computational complexity significantly by replacing convolutions with element-wise multiplications in the frequency domain. However, the increased memory-access overhead of complex-valued weights counteracts this computational benefit: frequency-domain convolutions not only pad the weights to the same size as the input maps, but also lack sharable complex kernel weights. In this work, we propose an FFT-based kernel-sharing technique, called FS-Conv, to reduce memory access. Based on FS-Conv, we derive sharable complex weights for frequency-domain convolutions, a problem that had not previously been solved. FS-Conv includes a hybrid padding approach that exploits the inherent periodicity of the FFT to provide sharable complex weights across different blocks of the complex input maps. In addition, we build a frequency-domain inference accelerator, called Yixin, that exploits the sharable complex weights for CNN acceleration. Evaluation results demonstrate significant performance and energy-efficiency benefits over a state-of-the-art baseline.
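
To make the mechanism concrete, below is a minimal NumPy sketch of generic FFT-based 2-D convolution via the convolution theorem. It illustrates only the element-wise frequency-domain multiplication and the kernel-padding overhead the abstract refers to, not FS-Conv's hybrid padding or kernel sharing; the function name `fft_conv2d` and all shapes are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.signal import convolve2d

def fft_conv2d(x, k):
    """2-D linear convolution via the convolution theorem:
    zero-pad both operands to a common size, multiply element-wise
    in the frequency domain, and transform back."""
    H, W = x.shape
    kh, kw = k.shape
    out_h, out_w = H + kh - 1, W + kw - 1
    # The s argument zero-pads the kernel up to the padded input
    # size -- this is the per-layer weight blow-up noted above.
    X = np.fft.rfft2(x, s=(out_h, out_w))
    K = np.fft.rfft2(k, s=(out_h, out_w))
    # Convolution in space becomes element-wise multiplication
    # of complex spectra in the frequency domain.
    return np.fft.irfft2(X * K, s=(out_h, out_w))

# Sanity check against a direct spatial convolution.
rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))
k = rng.standard_normal((3, 3))
assert np.allclose(fft_conv2d(x, k), convolve2d(x, k, mode="full"))
```

Note that CNN frameworks typically compute cross-correlation rather than true convolution; the two differ only by a flip of the kernel, so the frequency-domain cost structure is the same.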