Part321: Recognizing 3D Object Parts from a 2D Image Using 1-Shot Annotations

27 Sept 2024 (modified: 15 Nov 2024), ICLR 2025 Conference Withdrawn Submission, CC BY 4.0
Keywords: 3D Vision, 3D from 2D, Part recognition, One-shot
TL;DR: We achieve category-level 3D object part recognition from a single 2D image using one-shot 3D annotations via analysis-by-synthesis.
Abstract: Recognizing object parts from images plays a pivotal role in various real-world applications. However, existing work mostly learns models from large-scale 2D part annotations. In this paper, we propose a part recognition model that recognizes 3D parts from a 2D image using part annotations on only one 3D mesh model per object category. Specifically, we build a category-level 3D feature bank for meshes that overcomes geometric variance among objects and aligns precisely with diverse 2D images of the category. To achieve this, we learn two types of correspondence. First, we learn mesh-to-mesh correspondence between distinct 3D mesh models by matching geometry-aware features, which allows us to create a shared 3D feature bank for the object category. Second, we establish mesh-to-image correspondence by aligning features in the 3D feature bank with features extracted from 2D images. During inference, given a single image, our method recognizes 3D object parts via a render-and-compare approach: it predicts object parts by optimizing each part's 3D configuration with gradient-based updates, minimizing a feature-level reconstruction loss between the projected 3D features and the image features while enforcing geometric consistency between object parts. The position, rotation, and shape of each part are optimized to match the cues in the image, thereby recognizing the 3D parts from a 2D image. Experiments on the VehiclePart3D, PartImageNet, and UDA Part datasets show that our method significantly outperforms baselines on 2D part segmentation and is a pioneering approach to 3D part recognition from a single image.
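The render-and-compare step described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a hypothetical analytic image feature field (one Gaussian channel per vertex), one-hot feature-bank vectors, orthographic projection, a 2D translation as the only pose parameter, and finite-difference gradients. It also swaps the paper's feature-level reconstruction loss for a negative feature similarity, and omits the geometric-consistency term between parts. All function and variable names are made up for illustration.

```python
import numpy as np

def image_features(xy, centers, sigma=3.0):
    """Toy image feature field (assumption): channel j is a Gaussian bump
    centred at centers[j]. Returns a (K, K) array of features sampled at xy."""
    d2 = ((xy[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def project(vertices, t):
    """Orthographic projection (assumption): drop z, then shift by translation t."""
    return vertices[:, :2] + t

def matching_loss(t, vertices, bank, centers):
    """Negative mean similarity between each vertex's bank feature and the
    image feature sampled at that vertex's projected location."""
    feats = image_features(project(vertices, t), centers)
    return -(feats * bank).sum() / len(vertices)

def fit_translation(vertices, bank, centers, steps=100, lr=4.0, eps=1e-4):
    """Gradient descent on the translation, using central finite differences."""
    t = np.zeros(2)
    for _ in range(steps):
        grad = np.zeros(2)
        for i in range(2):
            d = np.zeros(2)
            d[i] = eps
            grad[i] = (matching_loss(t + d, vertices, bank, centers)
                       - matching_loss(t - d, vertices, bank, centers)) / (2 * eps)
        t = t - lr * grad
    return t

# Four mesh vertices of one hypothetical part, each with a one-hot bank feature.
vertices = np.array([[0.0, 0.0, 1.0], [10.0, 0.0, 2.0],
                     [0.0, 10.0, 0.5], [10.0, 10.0, 1.5]])
bank = np.eye(len(vertices))
true_t = np.array([1.5, -1.0])
centers = project(vertices, true_t)  # where the part actually appears in the "image"

t_hat = fit_translation(vertices, bank, centers)
print("estimated translation:", t_hat)
```

In this toy setting the optimizer only has to recover a translation; the method in the abstract additionally optimizes rotation and shape for each part.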
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 8437