Abstract: This paper presents a compact model for head-pose estimation from a single image. Previous state-of-the-art methods often rely on large models and converge slowly on standard GPUs. We introduce an attention-guided soft ranking loss that reduces the size of the state-of-the-art method while improving its performance. Specifically, we design an attention module that encourages learning on salient features. In addition, we propose a pair-wise soft ranking loss that supervises the model with paired samples and penalizes incorrect ordering of head-pose predictions. To address the lack of large-pose data, we also introduce a minority head-pose oversampling algorithm that balances the distribution of yaw, pitch, and roll angles. Experiments on the BIWI and AFLW2000 datasets demonstrate that our approach significantly outperforms state-of-the-art methods. Extensive ablation studies further validate the effectiveness and robustness of our framework's design. Code will be made available.
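To make the idea of a pair-wise soft ranking loss concrete, the following is a minimal sketch of one common formulation (a RankNet-style cross-entropy with soft targets). It is an assumption for illustration only, since the abstract does not specify the exact loss. The function names `sigmoid` and `soft_ranking_loss` and the `scale` parameter are hypothetical.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_ranking_loss(pred_i, pred_j, true_i, true_j, scale=0.1):
    """Illustrative pair-wise soft ranking loss for one angle (e.g. yaw).

    Assumed formulation, not necessarily the paper's:
    - the soft target is the probability that sample i has a larger
      angle than sample j, derived from the ground-truth difference;
    - the prediction is scored the same way from the predicted difference;
    - the loss is the cross-entropy between the two, so a pair predicted
      in the wrong order is penalized heavily.
    `scale` (hypothetical) controls how soft the ordering target is.
    """
    target = sigmoid(scale * (true_i - true_j))
    prob = sigmoid(scale * (pred_i - pred_j))
    # Clamp to avoid log(0) for extreme differences.
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


# Ground-truth yaw angles (degrees) for a pair of samples.
true_i, true_j = 30.0, -10.0

# Predictions that preserve the ordering versus predictions that swap it.
loss_correct = soft_ranking_loss(25.0, -5.0, true_i, true_j)
loss_swapped = soft_ranking_loss(-5.0, 25.0, true_i, true_j)
print(loss_correct < loss_swapped)
```

Under these assumptions, the loss depends only on the difference between the two predictions, so it would complement rather than replace a per-sample regression loss.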