Masked Graph Attention network for classification of facial micro-expression

Ankith Jain Rakesh Kumar, Bir Bhanu

Published: 2025, Last Modified: 12 Nov 2025, Image Vis. Comput. 2025, CC BY-SA 4.0

Abstract: Highlights
• We present an automatic landmark-aided dual-branch Masked Graph Attention (MaskGAT) network, which uses a learnable mask for each node to eliminate less important node features and propagates the important node features to the neighboring nodes.
• We design a masked self-attention graph pooling layer based on MaskGAT, which provides attention scores used to eliminate the least important nodes and keep only the nodes with high attention scores.
• We present a mathematical analysis based on entropy and KL-divergence to rigorously evaluate and visualize the impact of node feature masking and pooling in the MaskGAT pipeline.
• We select the frames with high expression intensity from a video using an adaptive frame selection approach based on a sliding-window optical flow method, discarding the frames with low expression intensity. We develop a three-frame graph structure to capture spatial and temporal information. We use a dual-branch masked attention network, with one branch for node locations and one for optical flow computed on patch data, and fuse their information.
• We provide an extensive assessment of the overall approach on four available datasets (SMIC, CASME II, SAMM, and MMEW) for 3, 5, and 7 categories of micro-expressions (MEs). We also evaluate our method in a cross-dataset setting to assess its generalization capability. The experimental results show an average improvement of 3.66% in accuracy, 2.23% in UF1 score, and 1.19% in UAR for 3-class and 5-class micro-expression classification.
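The highlights report results as accuracy, UF1, and UAR without defining the latter two. The sketch below assumes the convention that is common in micro-expression benchmarks: UF1 is the unweighted (macro) mean of per-class F1 scores, and UAR is the unweighted mean of per-class recall. The function name `uf1_uar` and the toy labels are hypothetical, and the paper's exact evaluation protocol may differ.

```python
def uf1_uar(y_true, y_pred):
    """Return (UF1, UAR), averaged without weighting over the classes in y_true.

    Assumption: UF1 = macro-averaged F1, UAR = macro-averaged recall.
    """
    classes = sorted(set(y_true))
    pairs = list(zip(y_true, y_pred))
    f1_scores, recalls = [], []
    for c in classes:
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
        # tp + fn is the class support, nonzero because c occurs in y_true.
        recalls.append(tp / (tp + fn))
    n = len(classes)
    return sum(f1_scores) / n, sum(recalls) / n


# Toy 3-class example with imbalanced classes (labels are illustrative only).
y_true = ["neg", "neg", "neg", "neg", "pos", "pos", "sur"]
y_pred = ["neg", "neg", "neg", "pos", "pos", "sur", "sur"]
uf1, uar = uf1_uar(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.4f}  UF1={uf1:.4f}  UAR={uar:.4f}")
```

Because both metrics weight every class equally, they are less dominated by the majority class than plain accuracy, which is why they are usually reported alongside it on imbalanced micro-expression datasets.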

External IDs: dblp:journals/ivc/KumarB25