Abstract: We present ScanNet++, a large-scale dataset that couples the capture of high-quality and commodity-level
geometry and color of indoor scenes. Each scene is captured with a high-end laser scanner at sub-millimeter resolution, along with registered 33-megapixel images from a
DSLR camera, and RGB-D streams from an iPhone. Scene
reconstructions are further annotated with an open vocabulary of semantics, with label-ambiguous scenarios explicitly annotated for comprehensive semantic understanding.
ScanNet++ enables a new real-world benchmark for novel
view synthesis, both from high-quality RGB capture and,
importantly, from commodity-level images, in addition
to a new benchmark for 3D semantic scene understanding
that comprehensively encapsulates diverse and ambiguous
semantic labeling scenarios. Currently, ScanNet++ contains 460 scenes, 280,000 captured DSLR images, and over
3.7M iPhone RGB-D frames.