Specializing CGRAs for Light-Weight Convolutional Neural Networks

Published: 01 Jan 2022, Last Modified: 06 Feb 2025 · IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022 · CC BY-SA 4.0
Abstract: Deep neural network (DNN) processing units, or DPUs, are among the most energy-efficient platforms for DNN applications. However, designing a new DPU for every DNN model is costly and time-consuming. In this article, we propose an alternative approach: specializing coarse-grained reconfigurable architectures (CGRAs), which are already quite capable of delivering high performance and high energy efficiency for compute-intensive kernels. We identify a small set of architectural features on a baseline CGRA that enable high-performance mapping of depthwise convolution (DWC) and pointwise convolution (PWC) kernels, which are the most important building blocks in recent light-weight DNN models. Our experimental results using MobileNets demonstrate that our proposed CGRA enhancement can deliver an $8\sim18\times$ improvement in area-delay product (ADP), depending on layer type, over a baseline CGRA with a state-of-the-art CGRA compiler. Moreover, our proposed CGRA architecture can also speed up 3-D convolution with efficiency similar to that of previous work, demonstrating the effectiveness of our architectural features beyond depthwise separable convolution (DSC) layers.
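As background (an illustrative derivation using notation assumed from the MobileNet literature, not taken from this abstract): depthwise separable convolution factorizes a standard convolution into a DWC followed by a PWC. With a $D_K \times D_K$ kernel, $M$ input channels, $N$ output channels, and a $D_F \times D_F$ output feature map, the ratio of multiply-accumulate operations is

$$\frac{D_K^2 \cdot M \cdot D_F^2 + M \cdot N \cdot D_F^2}{D_K^2 \cdot M \cdot N \cdot D_F^2} = \frac{1}{N} + \frac{1}{D_K^2},$$

roughly an $8\sim9\times$ reduction for the common $3\times3$ kernel, which is why DWC and PWC dominate the compute profile of light-weight models such as MobileNets.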