CMANET: Curvature-Aware Soft Mask Guided Attention Fusion Network for 2D+3D Facial Expression Recognition

Abstract: Because 2D texture and 3D structural information describe facial features in complementary ways, 2D+3D facial expression recognition (FER) has received widespread attention. Although recent 2D+3D FER methods perform well, they still face two challenges: how to attend to critical facial areas and how to fuse multi-modal information. To address these issues, we propose a curvature-aware soft mask guided attention fusion network (CMANet), which consists mainly of two components: a curvature-aware attention module and a multi-modal attention fusion module. The former uses a curvature-aware soft mask to guide the homo-modal attention mechanism toward potentially important areas with soft weights, while the latter applies pixel-level fusion to multi-modal features, retaining the significant information from each modality and allowing the multi-modal features to interact over a larger field of view. Extensive experiments show that CMANet achieves accuracies of 90.24% on BU-3DFE and 89.36% on Bosphorus, outperforming state-of-the-art methods.
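
The two ideas named in the abstract, soft-mask guided attention and pixel-level fusion, can be illustrated with a toy sketch. This is not the CMANet implementation: the abstract does not specify how the mask is computed or how the fusion weights are formed, so the min-max normalisation of absolute curvature, the mask-reweighted spatial softmax, and the per-pixel two-way softmax gate below are all assumptions, and the function names `soft_mask`, `masked_attention`, and `fuse_pixelwise` are hypothetical.

```python
import math


def soft_mask(curvature):
    """Assumed mask: min-max normalise absolute curvature to weights in [0, 1]."""
    mags = [[abs(c) for c in row] for row in curvature]
    flat = [m for row in mags for m in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant-curvature map
    return [[(m - lo) / span for m in row] for row in mags]


def masked_attention(scores, mask):
    """Assumed guidance: spatial softmax of scores, reweighted by the soft mask
    and renormalised so the weights sum to 1."""
    peak = max(s for row in scores for s in row)  # subtract max for stability
    weighted = [[math.exp(s - peak) * m for s, m in zip(srow, mrow)]
                for srow, mrow in zip(scores, mask)]
    total = sum(w for row in weighted for w in row) or 1.0
    return [[w / total for w in row] for row in weighted]


def fuse_pixelwise(feat_2d, feat_3d):
    """Assumed fusion: at each pixel, a two-way softmax gate over the two
    modality responses decides how much of each value is kept."""
    fused = []
    for row_2d, row_3d in zip(feat_2d, feat_3d):
        out = []
        for a, b in zip(row_2d, row_3d):
            gate = 1.0 / (1.0 + math.exp(b - a))  # weight given to the 2D value
            out.append(gate * a + (1.0 - gate) * b)
        fused.append(out)
    return fused


if __name__ == "__main__":
    mask = soft_mask([[0.0, -2.0], [1.0, 4.0]])
    attn = masked_attention([[0.0, 0.0], [0.0, 0.0]], mask)
    print(mask)
    print(attn)
    print(fuse_pixelwise([[2.0, 1.0]], [[0.0, 1.0]]))
```

Because the mask is soft rather than binary, low-curvature pixels are down-weighted instead of discarded, and the gate in the fusion step leans toward whichever modality responds more strongly at a given pixel while still mixing in the other.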