Image to Sphere: Learning Equivariant Features for Efficient Pose Prediction

David Klee; Ondrej Biza; Robert Platt; Robin Walters

Published: 01 Feb 2023, Last Modified: 27 Apr 2025

ICLR 2023 notable top 5%

Readers: Everyone

Keywords: equivariance, sample efficiency, pose detection, symmetry, SO(3)

Abstract: Predicting the pose of objects from a single image is an important but difficult computer vision problem. Methods that predict a single point estimate do not predict the pose of objects with symmetries well and cannot represent uncertainty. Alternatively, some works predict a distribution over orientations in $\mathrm{SO}(3)$. However, training such models can be computation- and sample-inefficient. Instead, we propose a novel mapping of features from the image domain to the 3D rotation manifold. Our method then leverages $\mathrm{SO}(3)$ equivariant layers, which are more sample efficient, and outputs a distribution over rotations that can be sampled at arbitrary resolution. We demonstrate the effectiveness of our method at object orientation prediction, and achieve state-of-the-art performance on the popular PASCAL3D+ dataset. Moreover, we show that our method can model complex object symmetries, without any modifications to the parameters or loss function. Code is available at https://dmklee.github.io/image2sphere.

Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning

TL;DR: We propose a novel architecture which efficiently describes uncertainty in pose estimation from images by using learned SO(3)-equivariant features to generate complex distributions over SO(3) with the Fourier basis.
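
To make the idea of a Fourier-basis distribution that "can be sampled at arbitrary resolution" concrete, here is a minimal sketch on the circle (rotations about a single axis) rather than on the full SO(3) manifold. It is a simplified analogue, not the paper's architecture: the function names, the coefficient layout, and the coefficient values are all hypothetical, and the paper's method uses learned SO(3)-equivariant features rather than hand-set coefficients. The sketch shows that a small set of band-limited coefficients can be evaluated on a grid of any size, and that a multi-modal result represents a symmetric object (here, two-fold symmetry) where a single point estimate could not.

```python
import numpy as np


def circle_logits(coeffs, angles):
    """Evaluate a band-limited real Fourier series at the given angles.

    coeffs: shape (2 * L + 1,), ordered [a_0, a_1, b_1, ..., a_L, b_L]
            (hypothetical layout chosen for this sketch)
    angles: angles in radians
    """
    coeffs = np.asarray(coeffs, dtype=float)
    angles = np.asarray(angles, dtype=float)
    max_freq = (len(coeffs) - 1) // 2
    logits = np.full_like(angles, coeffs[0])
    for k in range(1, max_freq + 1):
        logits += coeffs[2 * k - 1] * np.cos(k * angles)
        logits += coeffs[2 * k] * np.sin(k * angles)
    return logits


def circle_distribution(coeffs, resolution):
    """Turn Fourier coefficients into a discrete distribution on a grid.

    The same coefficients can be queried at any grid resolution.
    """
    angles = np.linspace(0.0, 2.0 * np.pi, resolution, endpoint=False)
    logits = circle_logits(coeffs, angles)
    logits = logits - logits.max()  # subtract max for a stable softmax
    probs = np.exp(logits)
    probs /= probs.sum()
    return angles, probs


# Assumed coefficients: only the frequency-2 cosine term is non-zero,
# so the logits are 4 * cos(2 * theta), with equal peaks at 0 and pi.
coeffs = np.zeros(7)
coeffs[3] = 4.0

coarse_angles, coarse_probs = circle_distribution(coeffs, 36)
fine_angles, fine_probs = circle_distribution(coeffs, 360)
```

Both grids are produced from the same seven coefficients; only the evaluation resolution changes. On SO(3) the analogous basis functions are the Wigner D-matrices, which this sketch does not implement.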

Supplementary Material: zip

Community Implementations: [4 code implementations](https://www.catalyzex.com/paper/image-to-sphere-learning-equivariant-features/code)
