Geometrically Consistent Generalizable Splatting

27 Apr 2025 (modified: 29 Oct 2025) · Submitted to NeurIPS 2025 · CC BY 4.0
Keywords: Generalizable Gaussian Splatting, Geometry-Consistent Reconstruction, Self-Supervised Learning, Relative Pose Estimation, Novel View Synthesis
Abstract: Gaussian splatting has emerged as the preferred 3D scene representation due to its speed and accuracy in novel view generation. Various attempts have thus been made to adapt multi-view structure prediction networks to directly predict per-pixel 3D Gaussians from images. However, most work has focused on enhancing self-supervised depth prediction networks to estimate the additional parameters of 3D Gaussians -- orientation, scale, opacity, and appearance. We show that optimizing a view-synthesis loss alone is insufficient to recover geometrically meaningful splats in this simple manner. We systematically analyze and address the inherent ambiguities in learning 3D Gaussian splats with self-supervision to obtain pose-free generalizable splatting. Our approach achieves state-of-the-art performance in (i) geometrically consistent reconstruction, (ii) relative pose estimation between images, and (iii) novel-view synthesis on the RealEstate10K and ACID datasets. We also showcase zero-shot capabilities of the proposed generalizable splatting on ScanNet, where our method substantially outperforms the prior art in recovering geometry and estimating relative pose.
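To make the "per-pixel 3D Gaussians" setup in the abstract concrete, the following is a minimal numpy sketch of how unconstrained network outputs are typically mapped to valid Gaussian parameters (positive depth and scale, unit rotation, bounded opacity) and how depth is unprojected to 3D centers via camera intrinsics. All names, shapes, and activation choices here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def pixels_to_centers(depth, intrinsics):
    """Unproject a per-pixel depth map to 3D Gaussian centers (camera frame)."""
    H, W = depth.shape
    fx, fy, cx, cy = intrinsics
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)  # (H, W, 3)

# Hypothetical raw per-pixel network outputs for a tiny 4x4 image:
# 1 depth + 4 quaternion + 3 scale + 1 opacity + 3 color = 12 channels.
H, W = 4, 4
rng = np.random.default_rng(0)
raw = rng.standard_normal((H, W, 12))

# Activations map unconstrained outputs to valid Gaussian parameters.
depth = np.exp(raw[..., 0])                              # positive depth
quat = raw[..., 1:5]
quat = quat / np.linalg.norm(quat, axis=-1, keepdims=True)  # unit rotation
scale = np.exp(raw[..., 5:8])                            # positive anisotropic scale
opacity = 1.0 / (1.0 + np.exp(-raw[..., 8]))             # sigmoid -> (0, 1)
color = 1.0 / (1.0 + np.exp(-raw[..., 9:12]))            # bounded appearance

centers = pixels_to_centers(depth, (2.0, 2.0, W / 2, H / 2))
print(centers.shape)  # one Gaussian center per pixel
```

The key point the paper makes is that supervising these outputs with a view-synthesis loss alone leaves the geometry (depth, scale) under-constrained, since many parameter combinations can render the same target views.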
Supplementary Material: zip
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 4328