Keywords: Diffusion Models for Vision, 3D Computer Vision, LiDAR, Point Cloud
Abstract: While significant progress has been made in 2D image generation, the generation of 3D point clouds remains less explored. Existing occupancy-based generation methods usually suffer from resolution limitations, while range-image-based methods are limited to single-frame rotating-scan LiDAR points. In this paper, we propose ResGen, a framework for realistic LiDAR-based point cloud generation. Our method first generates coarse 3D structures and then refines them into high-fidelity point clouds. Specifically, we build a 3D Residual Diffusion Model for refinement. Motivated by a pilot study revealing theoretical shortcomings of existing approaches, we model the diffusion process on the \textit{residual} between the coarse and refined point clouds. ResGen preserves fine-grained details and generalizes to multi-frame accumulated LiDAR point clouds. Experiments demonstrate that ResGen achieves superior results both qualitatively and quantitatively. Code will be made publicly available.
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 6434