GaussianDet3D: Bridging Gaussian Splatting and Sparse LiDAR Detection for Multi-View 3D Object Detection
Keywords: 3D Computer Vision, 3D Object Detection, 3D Gaussian Splatting, Camera-based 3D perception
Abstract: Accurate 3D object detection from cameras alone remains a fundamental challenge in autonomous driving, particularly for precise localization and velocity estimation, two capabilities critical for safe trajectory planning and collision avoidance. Most existing camera-based methods lift image features into dense Bird's-Eye View (BEV) grids, which struggle to capture fine-grained geometry and motion cues.
We present GaussianDet3D, to the best of our knowledge the first method to apply 3D Gaussian Splatting from multi-view images to 3D object detection for autonomous driving. It treats the predicted Gaussian primitives as a pseudo-LiDAR point cloud that is fed directly into a sparse LiDAR detector. Unlike a LiDAR point, which carries only coordinates and intensity, each Gaussian encodes parameters capturing geometry, orientation, opacity, and per-class semantic distributions. By aggregating Gaussian point clouds across multiple frames, GaussianDet3D captures temporal motion cues that enable precise velocity estimation without explicit tracking. On the nuScenes benchmark, GaussianDet3D achieves state-of-the-art translation error and velocity error among all camera-based methods, outperforming BEVFormer by 8.1% and 13.1%, respectively, while remaining competitive in overall detection score. These results demonstrate that Gaussian Splatting provides a geometrically precise, semantically rich representation that bridges the gap between image-based perception and LiDAR-quality spatial reasoning, particularly for the localization and motion estimation tasks most critical to autonomous driving safety. Code and checkpoints will be made publicly available upon publication.
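The idea of treating Gaussians as a pseudo-LiDAR point cloud and aggregating them over frames can be illustrated with a minimal sketch. This is not the authors' implementation: the `Gaussian` field layout, the opacity threshold, the ego-pose convention (a 4x4 row-major transform into the current frame), and the appended time-offset channel are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Gaussian:
    """One predicted 3D Gaussian primitive (hypothetical field layout)."""
    mean: Tuple[float, float, float]              # center in that frame's ego coordinates
    scale: Tuple[float, float, float]             # per-axis extent
    rotation: Tuple[float, float, float, float]   # unit quaternion (w, x, y, z)
    opacity: float                                # in [0, 1]
    class_probs: Tuple[float, ...]                # per-class semantic distribution


def transform(point: Sequence[float], pose: Sequence[Sequence[float]]) -> Tuple[float, ...]:
    """Apply a 4x4 row-major rigid transform to a 3D point."""
    x, y, z = point
    return tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in pose[:3])


def to_point(g: Gaussian, dt: float) -> List[float]:
    """Flatten a Gaussian into one point: xyz, then per-point features, then time offset."""
    return [*g.mean, *g.scale, *g.rotation, g.opacity, *g.class_probs, dt]


def aggregate(frames, min_opacity: float = 0.1) -> List[List[float]]:
    """Merge Gaussians from several frames into one pseudo-LiDAR point list.

    frames: iterable of (gaussians, pose_to_current, dt) tuples.
    min_opacity is an assumed filter for near-transparent Gaussians.
    """
    points = []
    for gaussians, pose, dt in frames:
        for g in gaussians:
            if g.opacity < min_opacity:
                continue
            # Only the center is moved into the current frame here; a fuller
            # version would also rotate the orientation quaternion.
            moved = Gaussian(transform(g.mean, pose), g.scale, g.rotation,
                             g.opacity, g.class_probs)
            points.append(to_point(moved, dt))
    return points
```

Each output row has the shape a sparse point-cloud detector typically expects: three coordinates followed by a feature vector. The trailing time offset is one common way to let a detector infer motion from stacked sweeps; whether GaussianDet3D uses this exact encoding is not stated in the abstract.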
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 11