Tufan: Low-Power Throughput Architecture for Acceleration of EfficientNet on Cloud FPGAs

Mohammadreza Baharani, Ushma Sunil Bharucha, Kaustubh Manohar Mhatre, Hamed Tabkhi

Published: 2021, Last Modified: 22 Nov 2023, Venue: SoCC 2021, Readers: Everyone

Abstract: Recent years have seen active growth in the design of application-specific architectures for accelerating Convolutional Neural Networks (CNNs). Among CNN architectures, the recently introduced EfficientNet has emerged as the state of the art: its extensible compound scaling approach increases network capacity to achieve higher accuracy with relatively low computational demand. However, application-specific architecture support that fully capitalizes on the nuances and benefits of EfficientNet is lacking. This paper presents Tufan, a throughput-oriented architecture for accelerating EfficientNet on cloud FPGAs. Tufan is a unified framework that supports various EfficientNet family architectures, demonstrating structural sparsity. The accelerator design introduces parameterizable and scalable compute units that can be configured according to user-specific requirements, the EfficientNet model, and the batch size. We assess the energy efficiency of Tufan when executing a set of EfficientNet family configurations implemented on Xilinx’s Alveo U50 FPGA board and an Nvidia Tesla P100 GPU. Our experimental results confirm that Tufan improves energy efficiency by 7.81% over the P100 GPGPU for a batch size of 28 at 300 MHz.
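The compound scaling rule mentioned in the abstract can be illustrated with a minimal sketch. This is not part of Tufan. It assumes the base coefficients reported in the original EfficientNet paper (alpha = 1.2 for depth, beta = 1.1 for width, gamma = 1.15 for resolution), and the function name `compound_scale` is hypothetical. The released EfficientNet-B1 to B7 models use rounded multipliers, so the values below illustrate the rule rather than reproduce exact model configurations.

```python
# Illustrative sketch of EfficientNet-style compound scaling (not Tufan code).
# Assumption: base coefficients as reported in the EfficientNet paper.
ALPHA = 1.2   # depth base
BETA = 1.1    # width base
GAMMA = 1.15  # input-resolution base


def compound_scale(phi: float) -> dict:
    """Return depth/width/resolution multipliers for compound coefficient phi.

    FLOPs of a convolutional network grow roughly linearly with depth and
    quadratically with width and resolution, hence depth * width^2 * res^2.
    """
    depth = ALPHA ** phi
    width = BETA ** phi
    resolution = GAMMA ** phi
    flops = depth * width ** 2 * resolution ** 2
    return {
        "depth": depth,
        "width": width,
        "resolution": resolution,
        "flops": flops,
    }


if __name__ == "__main__":
    for phi in range(4):
        s = compound_scale(phi)
        print(
            f"phi={phi}: depth x{s['depth']:.2f}, width x{s['width']:.2f}, "
            f"resolution x{s['resolution']:.2f}, FLOPs x{s['flops']:.2f}"
        )
```

Because alpha * beta^2 * gamma^2 is close to 2, each unit increase in phi roughly doubles the compute cost, which is why an accelerator serving the whole family benefits from compute units that can be resized per model.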
