Learning Global Spatial Information for Multi-View Object-Centric Models

Yuya Kobayashi; Masahiro Suzuki; Yutaka Matsuo

Published: 28 Jan 2022, Last Modified: 13 Feb 2023 | ICLR 2022 Submitted | Readers: Everyone

Keywords: deep generative models, object-centric representation learning, segmentation

Abstract: Several recent studies have addressed multi-view object-centric models, which infer object-centric representations from several observed views of a scene and predict unobserved views. In general, a multi-object scene is uniquely determined once both the properties of the individual objects and their spatial arrangement are specified; however, existing multi-view object-centric models infer only object-level representations and lack spatial information. This insufficient modeling can degrade novel-view synthesis quality and makes it difficult to generate novel scenes. Both spatial information and object representations can be modeled by introducing a hierarchical probabilistic model that places a global latent variable on top of the object-level latent variables. However, it is unclear how to perform inference and training with such a hierarchical multi-view object-centric model. We therefore introduce several crucial components that facilitate inference and training with the proposed model. We show that the proposed method achieves good inference quality and can also generate novel scenes.

One-sentence Summary: We introduce a global representation into a multi-view object-centric model to improve inference quality and enable novel scene generation.

Supplementary Material: zip
