DreamUp3D: Object-Centric Generative Models for Single-View 3D Scene Understanding and Real-to-Sim Transfer
Keywords: Object-Centric Generative Model, 3D Scene Understanding, Robotics
TL;DR: DreamUp3D is an Object-Centric Generative Model that performs inference on RGB-D images and provides a structured output that includes object segmentations, 3D reconstructions, 6D pose estimates and latent embeddings.
Abstract: Reasoning about scenes at an object-centric level is an important abstraction for robots and embodied agents. This abstraction is interpretable while also providing a structured representation useful for reasoning about the physical world. Although robots operate in 3D environments and 3D sensors are ubiquitous, object-centric representations typically decompose the world using only 2D data, e.g., images and video. Beyond the requirement for 3D scene understanding, object-centric models for embodied agents would ideally also operate in real time on real-world data, provide informative latent representations, approximate the 6D pose of each object, and output 3D reconstructions of individual objects and the full scene. DreamUp3D satisfies all of these requirements.
DreamUp3D is a novel Object-Centric Generative Model (OCGM) trained end-to-end in a self-supervised manner. The model is designed to operate on real RGB-D images, utilising both vision and depth to reason about 3D scenes. At inference, DreamUp3D runs in real time, providing object segmentations, 3D reconstructions, 6D pose estimates, and latent embeddings. This functionality makes DreamUp3D ideal for real-world deployments requiring object-centric representations.
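The structured per-object output described above can be sketched as a simple container type. This is a hypothetical illustration, not the authors' actual API: the field names (`mask`, `pose`, `latent`, `points`) and the per-object point-cloud representation are assumptions for the sake of example.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class ObjectSlot:
    """One object in a DreamUp3D-style structured output (hypothetical fields)."""
    mask: np.ndarray    # (H, W) boolean segmentation mask
    pose: np.ndarray    # (4, 4) homogeneous transform approximating 6D pose
    latent: np.ndarray  # (D,) latent embedding of the object
    points: np.ndarray  # (N, 3) reconstructed 3D points for this object

def scene_reconstruction(slots: list[ObjectSlot]) -> np.ndarray:
    """Full-scene reconstruction as the union of per-object point clouds."""
    return np.concatenate([s.points for s in slots], axis=0)
```

Such a representation makes the four advertised outputs (segmentation, reconstruction, pose, embedding) available per object while the full-scene reconstruction is recovered by aggregating the objects.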
Submission Number: 8