- Abstract: This paper introduces the task of semantic instance completion: from an incomplete RGB-D scan of a scene, we aim to detect the individual object instances comprising the scene and infer their complete object geometry. This enables a semantically meaningful decomposition of a scanned scene into individual, complete 3D objects, including hidden and unobserved object parts. This will open up new possibilities for interactions with object in a scene, for instance for virtual or robotic agents. To address this task, we propose 3D-SIC, a new data-driven approach that jointly detects object instances and predicts their completed geometry. The core idea of 3D-SIC is a novel end-to-end 3D neural network architecture that leverages joint color and geometry feature learning. The fully-convolutional nature of our 3D network enables efficient inference of semantic instance completion for 3D scans at scale of large indoor environments in a single forward pass. In a series evaluation, we evaluate on both real and synthetic scan benchmark data, where we outperform state-of-the-art approaches by over 15 in mAP@0.5 on ScanNet, and over 18 in mAP@0.5 on SUNCG.
- Keywords: 3d reconstruction, rgb-d scanning, 3d learning, 3d scene understanding
- TL;DR: From an incomplete RGB-D scan of a scene, we aim to detect the individual object instances comprising the scene and infer their complete object geometry.