Seeing Beyond Points: Adaptive Gaussian Primitives for 3D Perception

19 Sept 2025 (modified: 13 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0
Keywords: Point Cloud Processing, Gaussian Splatting, 3D Perception, Semantic Segmentation, Instance Segmentation
TL;DR: GCept converts raw point clouds into a compact field of adaptive 3‑D Gaussians and, with an alpha‑guided sampler, delivers state‑of‑the‑art 3‑D semantic and instance segmentation.
Abstract: The sparse and discrete nature of point clouds fundamentally limits their effectiveness in perception tasks, as these raw 3D data collections inadequately capture the continuous geometry and detailed appearance of complex real-world scenes. We propose **GCept**, a unified 3D perception framework that evolves raw points into adaptive Gaussian primitives, representing a natural progression in point cloud enrichment. GCept groups spatially proximate points into 3D Gaussians with optimized covariances and spherical harmonics encoding, forming a continuous density field that preserves intricate geometric structures and subtle visual details often lost in traditional pipelines. To enhance representational quality, GCept employs an alpha-guided sampling mechanism that strategically uses compositing weights from Gaussian Splatting to retain only the most informative primitives. The resulting enriched Gaussian representation integrates seamlessly into standard 3D perception backbones, providing richer geometric and appearance information for downstream tasks. Experiments on ScanNet, ScanNet++, ScanNet200, and S3DIS demonstrate state-of-the-art performance in semantic and instance segmentation, effectively bridging 3D reconstruction with robust perception.
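To make the two mechanisms named in the abstract concrete, here is a minimal Python sketch of (a) fitting one Gaussian per group of nearby points and (b) ranking primitives by alpha-compositing weight. It is illustrative only and is not the authors' implementation: the abstract does not say how GCept groups points, optimizes covariances, or aggregates weights. The voxel-grid grouping, the sample covariance, the single-ray compositing, and every name in it (`points_to_gaussians`, `compositing_weights`, `select_top_k`, `voxel_size`, `eps`) are assumptions made for this example. Spherical harmonics are omitted.

```python
import numpy as np


def points_to_gaussians(points, voxel_size=0.5, eps=1e-6):
    """Fit one Gaussian (mean, covariance) per occupied voxel.

    Voxel grouping is an assumed stand-in for "spatially proximate points".
    """
    points = np.asarray(points, dtype=float)
    keys = np.floor(points / voxel_size).astype(np.int64)
    _, inverse = np.unique(keys, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # flatten: shape differs across NumPy versions
    n_groups = int(inverse.max()) + 1

    means = np.zeros((n_groups, 3))
    covs = np.zeros((n_groups, 3, 3))
    for g in range(n_groups):
        members = points[inverse == g]
        means[g] = members.mean(axis=0)
        centered = members - means[g]
        # Sample covariance plus a small ridge so single-point groups stay valid.
        covs[g] = centered.T @ centered / len(members) + eps * np.eye(3)
    return means, covs


def compositing_weights(alphas):
    """Front-to-back weights w_i = alpha_i * prod_{j<i} (1 - alpha_j)."""
    alphas = np.asarray(alphas, dtype=float)
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    return alphas * transmittance


def select_top_k(weights, k):
    """Indices of the k primitives with the largest weights, in original order."""
    order = np.argsort(weights)[::-1]
    return np.sort(order[:k])


if __name__ == "__main__":
    pts = np.array([[0.1, 0.1, 0.1], [0.2, 0.2, 0.2], [1.1, 1.1, 1.1]])
    means, covs = points_to_gaussians(pts, voxel_size=0.5)
    print(means.shape, covs.shape)

    w = compositing_weights([0.5, 0.5, 0.5])
    print(w, select_top_k(w, k=2))
```

In this sketch a primitive hidden behind opaque ones gets a small weight and is dropped first, which is one plausible reading of "retain only the most informative primitives". A real pipeline would presumably accumulate such weights over many rays and views before selecting.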
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 18596