URDFormer: Constructing Interactive Realistic Scenes from Real Images via Simulation and Generative Modeling

Published: 21 Oct 2023, Last Modified: 21 Oct 2023, CoRL 2023 Workshop TGR (Oral)
Keywords: Interactive Scene Generation, URDF Prediction, Generative Modeling
Abstract: Constructing accurate and targeted simulation scenes that are both visually and physically realistic is of significant practical interest in domains ranging from robotics to computer vision. However, this process is typically done largely by hand: a graphic designer and a simulation engineer work together with predefined assets to construct rich scenes with realistic dynamic and kinematic properties. While this may suffice for small numbers of scenes, achieving the generalization required by data-driven machine learning algorithms demands a pipeline that can synthesize large numbers of realistic scenes, complete with “natural” kinematic and dynamic structures. To do so, we develop models for inferring structure and generating simulation scenes from natural images, allowing for scalable scene generation from web-scale datasets. To train these image-to-simulation models, we show how generative models can be used to produce training data, and how the resulting network can be inverted to map from realistic images back to complete scene models. We show how this paradigm allows us to build large datasets of scenes with semantic and physical realism, enabling a variety of downstream applications in robotics and computer vision.
Submission Number: 25
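
The abstract describes a pipeline that maps natural images to complete, articulated simulation scene descriptions. As a rough illustration only, the Python sketch below shows the general shape of such a pipeline: a placeholder predictor (standing in for the paper's learned model, which is not reproduced here) emits a part hierarchy with joint parameters, which is then serialized to URDF, the Unified Robot Description Format named in the title. All part names, sizes, and joint values here are hypothetical.

    # Hypothetical sketch of an image-to-URDF pipeline; only the URDF
    # emission follows the real Unified Robot Description Format schema.
    from dataclasses import dataclass
    from typing import List, Tuple
    import xml.etree.ElementTree as ET


    @dataclass
    class PredictedPart:
        name: str                    # e.g. "cabinet_door"
        parent: str                  # parent link name, "" for the root link
        joint_type: str              # "fixed", "revolute", or "prismatic"
        origin_xyz: Tuple[float, float, float]  # joint origin w.r.t. parent
        size: Tuple[float, float, float]        # box-approximated extents (m)


    def predict_parts(image_path: str) -> List[PredictedPart]:
        """Stand-in for the learned structure predictor: in the described
        pipeline, a network maps a natural image to a part hierarchy with
        joint parameters. Here a fixed cabinet-with-door example is returned
        purely for illustration."""
        return [
            PredictedPart("base", "", "fixed", (0.0, 0.0, 0.0), (0.6, 0.4, 0.8)),
            PredictedPart("door", "base", "revolute", (0.3, 0.2, 0.0), (0.02, 0.4, 0.8)),
        ]


    def parts_to_urdf(parts: List[PredictedPart], robot_name: str = "scene") -> str:
        """Assemble predicted parts into a URDF document string."""
        robot = ET.Element("robot", name=robot_name)
        for p in parts:
            # Each part becomes a link with a simple box visual.
            link = ET.SubElement(robot, "link", name=p.name)
            visual = ET.SubElement(link, "visual")
            geom = ET.SubElement(visual, "geometry")
            ET.SubElement(geom, "box", size=" ".join(map(str, p.size)))
            if p.parent:  # every non-root link gets a joint to its parent
                joint = ET.SubElement(
                    robot, "joint",
                    name=f"{p.parent}_to_{p.name}", type=p.joint_type)
                ET.SubElement(joint, "parent", link=p.parent)
                ET.SubElement(joint, "child", link=p.name)
                ET.SubElement(joint, "origin",
                              xyz=" ".join(map(str, p.origin_xyz)), rpy="0 0 0")
                if p.joint_type != "fixed":
                    # Articulated joints need an axis and motion limits.
                    ET.SubElement(joint, "axis", xyz="0 0 1")
                    ET.SubElement(joint, "limit", lower="0", upper="1.57",
                                  effort="10", velocity="1")
        return ET.tostring(robot, encoding="unicode")


    if __name__ == "__main__":
        urdf_xml = parts_to_urdf(predict_parts("kitchen_photo.jpg"))
        print(urdf_xml)  # loadable by URDF-aware simulators such as PyBullet

The resulting XML is a standard URDF kinematic tree, which is what makes the generated scenes directly usable in physics simulators for the downstream robotics applications the abstract mentions.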