Implementation of Tiled Point-Wise Convolution in MobileNet for Parallel Processing

Published: 01 Jan 2024, Last Modified: 12 Jun 2024, Venue: ICEIC 2024, License: CC BY-SA 4.0
Abstract: Convolutional neural networks (CNNs) have demonstrated outstanding performance in computer vision tasks. However, their heavy computational load makes CNNs difficult to deploy on edge and mobile devices. To address this, lightweight CNNs (e.g., MobileNet) and dedicated FPGA accelerator designs have gained attention. In MobileNet, the point-wise convolution (PWC) accounts for a significant portion of the computation and therefore has a critical impact on latency, which makes its optimization crucial for accelerating inference. In this study, we present a tiled PWC that partitions feature maps into tiles so that they can be processed in parallel, and we optimize this operation. The proposed design enables parallel PWC processing without additional controller modifications. We implement the proposed design on the Xilinx ZCU102 board and observe 4.0× and 3.1× improvements in throughput and power efficiency, respectively.
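To illustrate the idea of tiling in software, the following is a minimal sketch and not the FPGA design described in the abstract. It relies on the fact that a point-wise (1x1) convolution computes each output pixel from the input pixel at the same position only, so a feature map can be cut into spatial tiles that are processed independently and then stitched back together. The function names, the tile sizes, and the use of a thread pool as a stand-in for parallel processing units are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def pointwise_conv(fmap, weights):
    """1x1 convolution. fmap: H x W x C_in nested lists; weights: C_in x C_out."""
    c_out = len(weights[0])
    return [
        [
            [sum(px[ci] * weights[ci][co] for ci in range(len(px)))
             for co in range(c_out)]
            for px in row
        ]
        for row in fmap
    ]


def split_tiles(fmap, tile_h, tile_w):
    """Return (row0, col0, tile) triples; tiles at the edges may be smaller."""
    h, w = len(fmap), len(fmap[0])
    tiles = []
    for r in range(0, h, tile_h):
        for c in range(0, w, tile_w):
            tile = [row[c:c + tile_w] for row in fmap[r:r + tile_h]]
            tiles.append((r, c, tile))
    return tiles


def tiled_pointwise_conv(fmap, weights, tile_h, tile_w, workers=4):
    """Run the 1x1 convolution tile by tile and stitch the results together."""
    h, w = len(fmap), len(fmap[0])
    out = [[None] * w for _ in range(h)]
    tiles = split_tiles(fmap, tile_h, tile_w)
    # Threads are only a stand-in (assumption) for hardware processing units.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda t: pointwise_conv(t[2], weights), tiles)
        for (r0, c0, _), tile_out in zip(tiles, results):
            for dr, row in enumerate(tile_out):
                for dc, px in enumerate(row):
                    out[r0 + dr][c0 + dc] = px
    return out
```

Because no tile needs data from its neighbours, the tiled result matches the untiled one exactly. This independence is what makes PWC a natural candidate for parallel processing.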