Stylize and Align: Unlabeled-Image Stylized Continuous Consistency Regularization for Hand Pose Estimation in the Wild

23 Sept 2024 (modified: 14 Nov 2024) · ICLR 2025 Conference Withdrawn Submission · CC BY 4.0
Keywords: Pose estimation, 3D hand pose estimation, Consistency regularization, Style transfer, Robustness, Metric learning
TL;DR: We introduce a method combining style augmentation and continuous consistency regularization to boost 3D hand pose estimation in real-world settings, achieving significant improvements with less than 5% of the data used by InterHand2.6M.
Abstract: Hand pose estimation has become a cornerstone of advanced human behavior understanding. In particular, 3D hand pose estimation has received significant attention, with numerous approaches being proposed. However, it remains unclear whether modern approaches transfer directly to real-world scenarios. We focus on the robustness of hand pose estimators in the wild, noting that existing datasets differ markedly from real-world data. Thus, despite great advances, considerable room for improvement remains, as most recent efforts have concentrated on model architectures or on datasets collected in limited environments. To this end, we present a novel approach that unifies two key techniques: style transfer using unlabeled in-the-wild images to enhance data diversity (i.e., Stylize) and continuous consistency regularization (CCR) to capture fine-grained relations between hand pose data, providing rich supervisory signals (i.e., Align). To evaluate the robustness of the representations learned through our framework, we demonstrate that our method significantly enhances generalization across various tasks, including 3D hand pose estimation and transfer learning for 2D hand pose estimation, all within our designed real-world testbed. Notably, these improvements are achieved using less than 5% of the data size of a large-scale dataset, InterHand2.6M.
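To make the two components of the abstract concrete, here is a minimal, hypothetical sketch of the "Align" idea: a consistency loss that pulls a model's 3D keypoint predictions on an image and its stylized counterpart together, weighted by a continuous (rather than binary) pose-similarity term. All function names, the Gaussian weighting, and the 21-keypoint layout are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np


def stylized_consistency_loss(pred_orig, pred_styl):
    """MSE between 3D keypoint predictions for an image and its
    style-transferred counterpart (illustrative sketch only)."""
    return float(np.mean((pred_orig - pred_styl) ** 2))


def continuous_consistency_weight(pose_a, pose_b, sigma=1.0):
    """Soft pose similarity: identical poses get weight 1, and the
    weight decays smoothly with pose distance (hypothetical form)."""
    d = np.linalg.norm(pose_a - pose_b)
    return float(np.exp(-d ** 2 / (2 * sigma ** 2)))


rng = np.random.default_rng(0)
pose = rng.normal(size=(21, 3))                      # 21 hand keypoints in 3D
pred_orig = pose + 0.01 * rng.normal(size=(21, 3))   # prediction on original image
pred_styl = pose + 0.01 * rng.normal(size=(21, 3))   # prediction on stylized image

loss = stylized_consistency_loss(pred_orig, pred_styl)
weight = continuous_consistency_weight(pose, pose)   # same pose -> weight 1.0
weighted_loss = weight * loss
```

In this reading, "Stylize" supplies the stylized counterpart images from unlabeled in-the-wild data, while "Align" uses the continuous weight so that fine-grained relations between nearby-but-not-identical poses still yield supervisory signal.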
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2797