Accelerating Deformable Convolution Networks

Published: 2022, Last Modified: 30 Sept 2024. Venue: FCCM 2022. License: CC BY-SA 4.0
Abstract: We propose an accelerator design for Deformable Convolution Network inference. Our design is a hybrid of hardware pipelining and hardware re-use: (1) hardware pipelining hides the offset sampling overhead and uses the abundant HBM channels to support high throughput, while (2) hardware re-use ensures scalability to very deep networks. Based on this design paradigm, we develop a polynomial-time, 2-step design space exploration (DSE) engine that optimizes the tradeoff between (1) and (2) to achieve high hardware utilization. We implement our design using HLS on the Alveo U280 FPGA board, targeting deformable ResNet-50. Our DSE leads to ∼90% effective hardware utilization, and our design achieves up to 17 times higher throughput than the CPU baseline and up to 2.2 times higher throughput than the GPU baseline.
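For readers unfamiliar with the "offset sampling" step mentioned in the abstract, the following is a minimal software sketch of how a deformable convolution reads its inputs: each kernel tap is shifted by a learned fractional offset, and the feature map is read at that position by bilinear interpolation. This illustrates the general operation only, not the accelerator design described in the paper. The function names (`bilinear_sample`, `deformable_conv_at`), the single-channel 3x3 setting, and the zero-padding behaviour are assumptions made for the example.

```python
import math


def bilinear_sample(feature, y, x):
    """Read a 2D feature map (list of rows) at a fractional position (y, x).

    Assumption for this sketch: taps outside the map contribute 0.
    """
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0

    def tap(r, c):
        return feature[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    # Weighted sum of the four integer neighbours around (y, x).
    return ((1 - dy) * (1 - dx) * tap(y0, x0)
            + (1 - dy) * dx * tap(y0, x0 + 1)
            + dy * (1 - dx) * tap(y0 + 1, x0)
            + dy * dx * tap(y0 + 1, x0 + 1))


def deformable_conv_at(feature, weights, offsets, cy, cx):
    """Compute one output pixel of a 3x3 deformable convolution.

    Hypothetical simplified setting: one channel, stride 1, dilation 1.
    `weights` holds 9 scalars and `offsets` holds 9 (dy, dx) pairs,
    both in row-major order over the 3x3 kernel window.
    """
    acc = 0.0
    k = 0
    for ky in (-1, 0, 1):
        for kx in (-1, 0, 1):
            off_y, off_x = offsets[k]
            # Regular grid position plus the per-tap learned offset.
            acc += weights[k] * bilinear_sample(
                feature, cy + ky + off_y, cx + kx + off_x)
            k += 1
    return acc
```

With all offsets set to `(0.0, 0.0)`, `deformable_conv_at` reduces to an ordinary 3x3 convolution. The data-dependent, fractional reads in `bilinear_sample` are what make the memory access pattern irregular, which is the overhead the abstract says the pipelined design hides.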