Track: long paper (up to 8 pages)
Keywords: representation learning, geometric transformation, full-range head pose estimation
TL;DR: We propose a full-range head pose estimation model built on representation learning and geometric transformations.
Abstract: We propose a novel framework for representation learning in head pose estimation (HPE) that overcomes the challenges posed by sparse head pose data, which previously made triplet sampling infeasible. Leveraging recent advances in 3D-aware generative adversarial networks (3D GANs), we generate anchor-positive-negative triplets and perform contrastive learning on extensively augmented data, including geometric transformations. This enables the network to learn robust, geometry-aware representations that improve HPE accuracy. We observe that existing HPE models struggle when test images are slightly rotated or flipped, while our method maintains strong performance. Experiments show that our framework matches state-of-the-art models on standard test sets and outperforms them on augmented and full-range poses. Our model handles full-range HPE, accurately predicting head poses across the entire rotation spectrum, including upside-down orientations, and outperforms existing full-yaw range methods.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Submission Number: 88