Image to Sphere: Learning Equivariant Features for Efficient Pose PredictionDownload PDF

Published: 01 Feb 2023, Last Modified: 08 Sept 2024ICLR 2023 notable top 5%Readers: Everyone
Keywords: equivariance, sample efficiency, pose detection, symmetry, SO(3)
Abstract: Predicting the pose of objects from a single image is an important but difficult computer vision problem. Methods that predict a single point estimate do not predict the pose of objects with symmetries well and cannot represent uncertainty. Alternatively, some works predict a distribution over orientations in $\mathrm{SO}(3)$. However, training such models can be computation- and sample-inefficient. Instead, we propose a novel mapping of features from the image domain to the 3D rotation manifold. Our method then leverages $\mathrm{SO}(3)$ equivariant layers, which are more sample efficient, and outputs a distribution over rotations that can be sampled at arbitrary resolution. We demonstrate the effectiveness of our method at object orientation prediction, and achieve state-of-the-art performance on the popular PASCAL3D+ dataset. Moreover, we show that our method can model complex object symmetries, without any modifications to the parameters or loss function. Code is available at \url{https://dmklee.github.io/image2sphere}.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning
TL;DR: We propose a novel architecture which efficiently describes uncertainty in pose estimation from images by using learned SO(3)-equivariant features to generate complex distributions over SO(3) with the Fourier basis.
Supplementary Material: zip
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 4 code implementations](https://www.catalyzex.com/paper/image-to-sphere-learning-equivariant-features/code)
11 Replies

Loading