VFL3D: A Single-Stage Fine-Grained Lightweight Point Cloud 3D Object Detection Algorithm Based on Voxels

Published: 2024, Last Modified: 05 Nov 2025. IEEE Trans. Intell. Transp. Syst. 2024. License: CC BY-SA 4.0
Abstract: In this work, we propose a voxel-based, single-stage, fine-grained, and efficient point cloud 3D object detection algorithm to address the inadequate granularity of point cloud feature extraction and the imbalance between efficiency and accuracy in single-stage point cloud 3D object detection. We develop a lightweight multibranch cross-sparse convolution network (LMCCN) that is designed to preserve the feature granularity of the original point cloud while improving extraction efficiency. Additionally, we introduce a compact fine-grained self-attention augmented bird’s eye view (BEV) feature extraction module (CFSAM). This module further refines BEV features, enabling the acquisition of both locally and globally enhanced features and thereby augmenting the perceptual capabilities of the constructed model. Without bells and whistles, the proposed method attains excellent performance on several autonomous driving benchmarks, with detection accuracies of up to 81.67% on KITTI, 72.74% on ONCE, and 84.00% on nuScenes. Moreover, it reaches a peak detection speed of 46.08 FPS, effectively balancing accuracy with speed.