Keywords: 3D Gaussian, Autonomous Driving, Instance-Level, Dynamic Scene Reconstruction
TL;DR: The proposed approach enables efficient discrimination of 3D Gaussians for instance-level scene understanding without requiring extra 3D annotations or additional networks, and achieves state-of-the-art performance in dynamic scene reconstruction.
Abstract: Current dynamic 3D Gaussian approaches attempt to decompose scene motion by using purely implicit features or additional 3D annotations. However, these strategies hinder fine-grained and interpretable control over Gaussians and overlook the inherent instance consistency, spatial continuity, and local rigidity in scene motion. To this end, we propose a novel framework that achieves MLP-free Gaussian instance distinction and effectively disentangles dynamic urban street scenes. Specifically, we assign each Gaussian a compact multi-hot instance feature, enabling direct differentiation without relying on an additional network. To model transient motions, we initialize sparse control points at the instance level and construct the motion field from coarse to fine by leveraging spatiotemporal relationships. Additionally, we introduce two instance-level losses: an instance-level region loss and an instance-semantic loss. The former, combined with the opacity rendering pipeline, enables precise instance-level rendering and suppresses ghosting artifacts. The latter enforces cross-view feature consistency and optimizes the spatial positions of instances. Notably, our framework avoids costly 3D instance annotations by instead using 2D pseudo-labels generated by the Segment Anything Model (SAM) for supervision. Our demo video and code are available in the anonymous repository at https://anonymous.4open.science/r/DDI-GS-750E/.
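The abstract does not spell out how the multi-hot instance feature is laid out. As a rough illustration only, the sketch below assumes one plausible reading: each Gaussian stores a short vector whose entries approximate the binary digits of its instance ID, so that thresholding recovers the ID without any decoding network. The function names `encode_multi_hot` and `decode_multi_hot`, the bit ordering, and the 0.5 threshold are all hypothetical and are not taken from the paper or its repository.

```python
# Illustrative sketch only: assumes a multi-hot feature is the binary code
# of an integer instance ID (least significant bit first). This layout is
# an assumption, not the paper's confirmed encoding.

def encode_multi_hot(instance_id, num_bits):
    """Return a list of num_bits floats (0.0 or 1.0) encoding instance_id."""
    if not 0 <= instance_id < 2 ** num_bits:
        raise ValueError("instance_id does not fit in num_bits")
    return [float((instance_id >> k) & 1) for k in range(num_bits)]


def decode_multi_hot(feature, threshold=0.5):
    """Threshold a (possibly noisy) feature vector and rebuild the ID."""
    return sum(1 << k for k, value in enumerate(feature) if value > threshold)


# A feature with 3 entries can separate up to 2**3 = 8 instances.
clean = encode_multi_hot(5, num_bits=3)
noisy = [0.9, 0.2, 0.8]  # hypothetical values after optimization
recovered = decode_multi_hot(noisy)
```

Under this assumption, a feature of length K distinguishes up to 2^K instances, which is one way a feature could be both compact and directly decodable by thresholding alone.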
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 6446