FacialFlowNet: Advancing Facial Optical Flow Estimation with a Diverse Dataset and a Decomposed Model

Jianzhi Lu; Ruian He; Shili Zhou; Weimin Tan; Bo Yan

Published: 20 Jul 2024, Last Modified: 05 Aug 2024, MM2024 Poster, CC BY 4.0

Abstract: Facial movements play a crucial role in conveying attitude and intentions, and facial optical flow provides a dynamic and detailed representation of them. However, the scarcity of datasets and the lack of a modern baseline hinder progress in facial optical flow research. This paper proposes FacialFlowNet (FFN), a novel large-scale facial optical flow dataset, and the Decomposed Facial Flow Model (DecFlow), the first method capable of decomposing facial flow. FFN comprises 9,635 identities and 105,970 image pairs, offering unprecedented diversity for detailed facial and head motion analysis. DecFlow features a facial semantic-aware encoder and a decomposed flow decoder, and it accurately estimates facial flow and decomposes it into head and expression components. Comprehensive experiments demonstrate that FFN significantly enhances the accuracy of facial flow estimation across various optical flow methods, achieving up to an 11% reduction in Endpoint Error (EPE) (from 3.91 to 3.48). Moreover, DecFlow, when coupled with FFN, outperforms existing methods in both synthetic and real-world scenarios, enhancing facial expression analysis. The decomposed expression flow achieves a substantial relative accuracy improvement of 18% (from 69.1% to 82.1%) in micro-expression recognition. These contributions represent a significant advancement in facial motion analysis and optical flow estimation. Code and datasets will be made available to the public.
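The abstract reports its gains as relative changes, so a small worked check may help when reading the numbers. The sketch below assumes the usual definition of Endpoint Error (the mean Euclidean distance between predicted and ground-truth flow vectors), which the abstract itself does not spell out. The function names `endpoint_error` and `relative_change` and the toy flow values are illustrative and are not taken from the paper or its code.

```python
import math


def endpoint_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    pred, gt: equal-length sequences of (u, v) displacement pairs.
    Assumes the conventional EPE definition; the abstract does not define it.
    """
    if len(pred) != len(gt) or not pred:
        raise ValueError("pred and gt must be non-empty and the same length")
    total = sum(math.hypot(pu - gu, pv - gv)
                for (pu, pv), (gu, gv) in zip(pred, gt))
    return total / len(pred)


def relative_change(before, after):
    """Signed change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before


# Toy flow field (illustrative values, not data from the paper).
pred = [(1.0, 0.0), (0.0, 2.0)]
gt = [(0.0, 0.0), (0.0, 0.0)]
epe = endpoint_error(pred, gt)  # per-pixel errors are 1 and 2

# Figures quoted in the abstract.
epe_drop = -relative_change(3.91, 3.48)  # reduction in EPE
acc_gain = relative_change(69.1, 82.1)   # gain in recognition accuracy

print(f"toy EPE: {epe:.2f}")
print(f"EPE reduction: {epe_drop:.1%}")
print(f"accuracy gain: {acc_gain:.1%} relative, {82.1 - 69.1:.1f} points absolute")
```

Under these assumptions, the quoted 11% matches the relative drop in EPE, and the quoted 18% matches the relative gain in accuracy, which corresponds to 13 percentage points in absolute terms.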

Primary Subject Area: [Engagement] Emotional and Social Signals

Secondary Subject Area: [Engagement] Emotional and Social Signals

Relevance To Conference: 1) We contribute FacialFlowNet (FFN), a large-scale facial optical flow dataset comprising 105,970 pairs of realistic images from 9,635 identities, along with precise optical flow labels for both facial flow and head flow. 2) We present DecFlow, the first network capable of decomposing facial optical flow into head flow and expression flow. The decomposed expression flow achieves a substantial relative accuracy improvement of 18% (from 69.1% to 82.1%) in micro-expression recognition. 3) Extensive experiments demonstrate that FFN significantly enhances the accuracy of facial flow estimation across various optical flow methods, achieving up to an 11% reduction in Endpoint Error (EPE) (from 3.91 to 3.48). Moreover, DecFlow outperforms other state-of-the-art methods, providing better insights for the analysis of facial movements.

Supplementary Material: zip

Submission Number: 924
