LaRI: Layered Ray Intersections for Single-view 3D Geometric Reasoning

14 Sept 2025 (modified: 11 Feb 2026), Submitted to ICLR 2026, CC BY 4.0
Keywords: 3D reconstruction, unseen scene reconstruction, depth estimation, point maps
TL;DR: A single feed-forward method that models unseen 3D geometry using layered point maps, achieving fast and accurate reconstruction in both object and scene tasks.
Abstract: We present Layered Ray Intersections (LaRI), a fully supervised method for reasoning about occluded geometry from a single image. Unlike conventional depth estimation, which is limited to visible surfaces, LaRI predicts multiple surfaces intersected by the camera rays using layered point maps. In contrast to existing approaches that rely on neural implicit representations or iterative refinement, LaRI achieves complete scene reconstruction in one feed-forward pass, enabling efficient and view-aligned geometric reasoning for both object-level and scene-level tasks. We further propose to predict the ray stopping index, which identifies valid intersecting pixels and layers from LaRI’s output. To support and evaluate this task, we build an annotation pipeline using rendering engines and construct annotations for five public datasets, including synthetic and real-world data covering 3D objects and scenes. As a generic method, LaRI is validated on both object-level and scene-level reconstruction tasks.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 5273
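
The layered representation described in the abstract can be illustrated with a toy example. The sketch below is not the LaRI model, which is a learned network; it only builds the kind of target the abstract describes for a single camera ray, using analytic ray-sphere intersections. The function names, the sphere scene, the `None` padding, and the reading of the ray stopping index as the number of valid intersections along a ray are all assumptions made for illustration.

```python
import math

def ray_sphere_hits(unit_dir, center, radius):
    """Distances t > 0 at which the ray t * unit_dir (camera at the origin) hits a sphere."""
    b = sum(d * c for d, c in zip(unit_dir, center))   # projection of the center onto the ray
    c = sum(x * x for x in center) - radius * radius
    disc = b * b - c                                   # discriminant of t^2 - 2bt + c = 0
    if disc < 0:
        return []                                      # the ray misses the sphere
    root = math.sqrt(disc)
    return [t for t in (b - root, b + root) if t > 0]

def layered_points(direction, spheres, num_layers):
    """Return (points, stop_index) for one ray.

    points has exactly num_layers entries: 3D intersection points ordered by
    depth, padded with None. stop_index is the number of valid layers
    (hypothetical definition, see the note above).
    """
    norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / norm for d in direction]
    hits = sorted(
        t for center, radius in spheres
        for t in ray_sphere_hits(unit, center, radius)
    )
    stop_index = min(len(hits), num_layers)
    points = [[t * u for u in unit] for t in hits[:stop_index]]
    points += [None] * (num_layers - stop_index)       # invalid layers
    return points, stop_index

# Two spheres on the optical axis: the second is fully occluded by the first.
scene = [((0.0, 0.0, 3.0), 1.0), ((0.0, 0.0, 6.0), 1.0)]
points, stop_index = layered_points((0.0, 0.0, 1.0), scene, num_layers=4)
print(stop_index, [p[2] for p in points if p is not None])
```

A conventional depth map would keep only the first intersection on this ray, while the layered output also records the back of the front sphere and both surfaces of the occluded one. Repeating the computation for every pixel's ray would give a layered point map, with the stopping index marking which layers are valid at each pixel.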