Pix2Scene: Learning Implicit 3D Representations from Images

Sai Rajeswar; Fahim Mannan; Florian Golemo; David Vazquez; Derek Nowrouzezahrai; Aaron Courville

27 Sept 2018 (modified: 05 May 2023) | ICLR 2019 Conference Blind Submission | Readers: Everyone

Abstract: Modelling 3D scenes from 2D images is a long-standing problem in computer vision, with implications for fields such as simulation and robotics. We propose pix2scene, a deep generative approach that implicitly models the geometric properties of a scene from images. Our method learns the depth and orientation of the scene points visible in images. The model can then predict the structure of a scene from various previously unseen viewpoints. It relies on a bi-directional adversarial learning mechanism to generate scene representations from a latent code, inferring the 3D representation of the underlying scene geometry. We showcase a novel differentiable renderer to train the 3D model in an end-to-end fashion, using only images. We demonstrate the generative ability of our model qualitatively on both a custom dataset and on ShapeNet. Finally, we evaluate the effectiveness of the learned 3D scene representation in supporting 3D spatial reasoning.

Keywords: Representation learning, generative model, adversarial learning, implicit 3D generation, scene generation

TL;DR: pix2scene: a deep generative approach for implicitly modelling the geometric properties of a 3D scene from images

Data: [ShapeNet](https://paperswithcode.com/dataset/shapenet)
