Few-Shot Single-View 3-D Object Reconstruction with Compositional Priors

ECCV 2020 (modified: 11 Nov 2022)
Abstract: The impressive performance of deep convolutional neural networks in single-view 3D reconstruction suggests that these models perform non-trivial reasoning about the 3D structure of the output space. Recent work has challenged this belief, showing that on standard benchmarks, complex encoder-decoder architectures perform similarly to nearest-neighbor baselines or to simple linear decoder models that exploit large amounts of per-category data. A more realistic setting, however, involves inferring 3D shapes for categories with few available training examples; this requires a model that can generalize successfully to novel object classes. In this work we experimentally demonstrate that naive baselines fail in this few-shot learning setting, where the network must learn informative shape priors for inference on new categories. We propose three ways to learn a class-specific global shape prior directly from data. Using these techniques, our learned prior captures multi-scale information about the 3D shape and accounts for intra-class variability by virtue of an implicit compositional structure. Experiments on the popular ShapeNet dataset show that, in the few-shot setting, our method outperforms a zero-shot baseline by over $50\%$ and the current state of the art by over $10\%$ in terms of relative performance.