Masked facial expression recognition based on temporal overlap module and action unit graph convolutional network
Abstract: Facial expressions do not always reflect people's genuine emotions. People often use masked facial expressions
(MFEs) to hide what they truly feel. Recognizing MFEs can help reveal these hidden emotions, which has
important practical value in mental health, security, and education. However, MFEs are highly
complex and under-researched, and existing facial expression recognition algorithms cannot recognize
MFEs and the hidden genuine emotions well at the same time. To obtain better representations of MFEs, we
first adopt the transformer model as the basic framework and design a temporal overlap module to enlarge the
temporal receptive field of the tokens, strengthening the capture of muscle movement patterns in MFE
sequences. Second, we design a graph convolutional network (GCN) that uses action unit (AU) intensities as node
features and a 3D learnable adjacency matrix based on AU activation states to reduce the irrelevant identity
information introduced by the image input. Finally, we propose a novel end-to-end dual-stream network that combines
the image stream (transformer) with the AU stream (GCN) for automatic recognition of MFEs. Compared with
other methods, our approach achieves state-of-the-art results on the core tasks of the Masked Facial Expression
Database (MFED).
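To make the dual-stream idea concrete, the following is a minimal PyTorch sketch of how an image stream (a transformer over frame tokens) and an AU stream (a GCN over AU-intensity node features with a learnable adjacency matrix) could be fused for classification. All module names, dimensions, the number of AUs, and the fusion-by-concatenation choice are illustrative assumptions, not the authors' exact architecture; the temporal overlap module is assumed to be part of the tokenization that produces the frame tokens.

```python
# Illustrative sketch only: names, sizes, and fusion are assumptions.
import torch
import torch.nn as nn


class AUGraphConv(nn.Module):
    """One GCN layer over AU nodes with a learnable adjacency matrix (assumed form)."""
    def __init__(self, num_aus, in_dim, out_dim):
        super().__init__()
        # Learnable adjacency over AU nodes; the paper describes a 3D learnable
        # adjacency based on AU activation states, simplified here to 2D.
        self.adj = nn.Parameter(torch.eye(num_aus) + 0.01 * torch.randn(num_aus, num_aus))
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):  # x: (batch, num_aus, in_dim)
        # Propagate node features along the (softmax-normalized) learned graph.
        x = torch.einsum("ij,bjd->bid", torch.softmax(self.adj, dim=-1), x)
        return torch.relu(self.linear(x))


class DualStreamMFE(nn.Module):
    """Image stream (transformer over frame tokens) + AU stream (GCN), fused by concatenation."""
    def __init__(self, num_classes, token_dim=256, num_aus=17, au_dim=1):
        super().__init__()
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=token_dim, nhead=8, batch_first=True)
        self.image_stream = nn.TransformerEncoder(encoder_layer, num_layers=2)
        self.au_stream = AUGraphConv(num_aus, au_dim, 64)
        self.classifier = nn.Linear(token_dim + 64, num_classes)

    def forward(self, frame_tokens, au_intensities):
        # frame_tokens: (batch, num_frames, token_dim); assumed to already use
        # the temporal-overlap tokenization described in the abstract.
        img_feat = self.image_stream(frame_tokens).mean(dim=1)
        # au_intensities: (batch, num_aus) -> one scalar feature per AU node.
        au_feat = self.au_stream(au_intensities.unsqueeze(-1)).mean(dim=1)
        return self.classifier(torch.cat([img_feat, au_feat], dim=-1))


if __name__ == "__main__":
    model = DualStreamMFE(num_classes=6)
    tokens = torch.randn(2, 16, 256)   # 2 clips, 16 frame tokens each
    aus = torch.rand(2, 17)            # 17 AU intensities per clip (assumed count)
    print(model(tokens, aus).shape)    # torch.Size([2, 6])
```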