Abstract: Accurate and real-time LiDAR semantic segmentation is necessary for advanced autonomous driving systems. To guarantee a fast inference speed, previous methods utilize the highly optimized 2D convolutions to extract features on the range view (RV), which is the most compact representation of the LiDAR point clouds. However, these methods often suffer from lower accuracy for two reasons: 1) the information loss during the projection from 3D points to the RV, 2) the semantic ambiguity when 3D points labels are assigned according to the RV predictions. In this work, we introduce an end-to-end point-range fusion network (PRNet) that extracts semantic features mainly on the RV and iteratively fuses the RV features back to the 3D points for the final prediction. Besides, a novel range view projection (RVP) operation is designed to alleviate the information loss during the projection to the RV, and a point-range convolution (PRConv) is proposed to automatically mitigate the semantic ambiguity during transmitting features from the RV back to 3D points. Experiments on the SemanticKITTI and nuScenes benchmarks demonstrate that the PRNet pushes the range-based methods to a new state-of-the-art, and achieves a better speed-accuracy trade-off.
0 Replies
Loading