A 28-nm Energy-Efficient Sparse Neural Network Processor for Point Cloud Applications Using Block-Wise Online Neighbor Searching

Published: 2024, Last Modified: 16 May 2025. IEEE J. Solid State Circuits 2024. License: CC BY-SA 4.0
Abstract: Voxel-based point cloud networks composed of multiple kinds of sparse convolutions (SCONVs) play an essential role in emerging applications such as autonomous driving and visual navigation. Many sparse processors have been proposed for image applications, but they cannot properly handle three problems specific to point clouds: inefficient random memory access; non-parallel neighbor search together with the area overhead of supporting hybrid operators; and unbalanced workload among multiple cores. In this work, a 2-D/3-D unified SCONV accelerator is proposed with three key features: a block-wise sparse data storage format supporting out-of-order memory allocation and continuous memory access; a high-throughput, reconfigurable SCONV core providing unified support for multiple kinds of sparse CNNs; and a hybrid asynchronous/synchronous scheduler for multiple cores with a dynamic on-chip memory router to maximize data reuse and core utilization. The chip is fabricated in 28-nm CMOS technology and achieves a peak energy efficiency of 4.68 TOPS/W, $2\times$ higher than the previous accelerator. It is also the first accelerator to provide unified 2-D/3-D support and end-to-end inference ability for voxel-based point cloud networks.
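To make the idea of block-wise neighbor searching concrete, the following is a minimal software sketch, not the chip's actual hardware scheme. It assumes active voxels are given as integer coordinates, groups them into cubic blocks, and resolves each 3x3x3 neighbor lookup by first selecting the block that would contain the neighbor. The block size `BLOCK` and the function names `build_blocks` and `find_neighbors` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from itertools import product

BLOCK = 4  # hypothetical block edge length in voxels (assumption)


def build_blocks(coords):
    """Group active voxel coordinates by the block that contains them."""
    blocks = defaultdict(dict)
    for idx, (x, y, z) in enumerate(coords):
        key = (x // BLOCK, y // BLOCK, z // BLOCK)
        blocks[key][(x, y, z)] = idx
    return blocks


def find_neighbors(coords, blocks):
    """For each active voxel, list (kernel_offset, neighbor_index) pairs
    found inside a 3x3x3 window centered on that voxel."""
    offsets = list(product((-1, 0, 1), repeat=3))
    pairs = []
    for x, y, z in coords:
        found = []
        for dx, dy, dz in offsets:
            n = (x + dx, y + dy, z + dz)
            # Only the single block that could hold this neighbor is probed.
            block = blocks.get((n[0] // BLOCK, n[1] // BLOCK, n[2] // BLOCK))
            if block is not None and n in block:
                found.append(((dx, dy, dz), block[n]))
        pairs.append(found)
    return pairs


if __name__ == "__main__":
    voxels = [(0, 0, 0), (1, 0, 0), (4, 0, 0), (3, 0, 0)]
    table = find_neighbors(voxels, build_blocks(voxels))
    for idx, found in enumerate(table):
        print(idx, found)
```

Each voxel's lookups depend only on the read-only block table, so in this sketch they are independent of one another; the sparse convolution itself would then gather features using the resulting (offset, index) pairs.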