Robust Face Alignment via Inherent Relation Learning and Uncertainty Estimation

Jiahao Xia, Min Xu, Haimin Zhang, Jianguo Zhang, Wenjian Huang, Hu Cao, Shiping Wen

Published: 01 Jan 2023, Last Modified: 03 Nov 2023 | IEEE Trans. Pattern Anal. Mach. Intell. 2023 | Readers: Everyone

Abstract: Humans tend to locate heavily occluded facial landmarks by their position relative to easily identified landmarks. This clue is defined as the landmark inherent relation, yet it is ignored by most existing methods. In this paper, we present the Dynamic Sparse Local Patch Transformer (DSLPT), a novel face alignment framework for inherent relation learning and uncertainty estimation. Unlike most existing methods, which regress facial landmarks directly from global features, the DSLPT first generates a rough representation of each landmark from a local patch cropped from the feature map and then adaptively aggregates these representations using a case-dependent inherent relation. Finally, the DSLPT predicts the coordinate and uncertainty of each landmark by regressing its probability distribution from the output features. Moreover, we introduce a coarse-to-fine framework that incorporates the DSLPT for an improved result. In this framework, the position and size of each patch are determined by the probability distribution of the corresponding landmark predicted in the previous stage. The dynamic patches ensure a fine-grained landmark representation for inherent relation learning, so that a rough prediction can gradually converge to the target facial landmarks. We integrate the coarse-to-fine model into an end-to-end training pipeline and carry out experiments on the mainstream benchmarks. The results demonstrate that the DSLPT achieves state-of-the-art performance with much lower computational complexity. The code and models are available at https://github.com/Jiahao-UTS/DSLPT .
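
The coarse-to-fine step described in the abstract, where each patch's position and size follow from the distribution predicted in the previous stage, can be illustrated with a minimal sketch. This is not the authors' implementation: the `LandmarkEstimate` container, the `next_stage_patch` function, the isotropic `sigma`, and every constant below are assumptions made for illustration only. The sketch shows one plausible reading, in which the patch is centered on the predicted mean and its half-size grows with the predicted uncertainty.

```python
from dataclasses import dataclass


@dataclass
class LandmarkEstimate:
    """Hypothetical container for one landmark's predicted distribution."""
    x: float      # predicted mean, normalized to [0, 1]
    y: float      # predicted mean, normalized to [0, 1]
    sigma: float  # predicted standard deviation (uncertainty), same units


def next_stage_patch(est, scale=3.0, min_half=0.02, max_half=0.25):
    """Return (left, top, right, bottom) of the patch for the next stage.

    The patch is centered on the predicted mean. Its half-size is
    scale * sigma, clamped to [min_half, max_half]. All constants are
    illustrative assumptions, not values taken from the paper.
    """
    half = min(max(scale * est.sigma, min_half), max_half)
    return (est.x - half, est.y - half, est.x + half, est.y + half)


# A confident prediction yields a small patch; an uncertain one a larger patch.
confident = next_stage_patch(LandmarkEstimate(0.5, 0.5, 0.01))
uncertain = next_stage_patch(LandmarkEstimate(0.5, 0.5, 0.05))
```

Under these assumptions, a landmark predicted with low uncertainty is refined from a tight crop, while an uncertain (for example, occluded) landmark keeps a wider context in the next stage.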
