Abstract: Facial expression spotting is an effective means of characterizing changes in human behavior. It refers to the precise localization of the temporal intervals in a sequence during which a visual event occurs on a face. In this paper, we propose an innovative framework that relies on the consistency, in terms of orientation and intensity, of local facial motions. First, we build local facial motion consistency maps to differentiate expression-related facial motion from facial noise. These maps are then fed into a recurrent neural network to precisely delineate the temporal progression of facial expression activation. Extensive evaluations on the SNAP-2DFE dataset demonstrate the effectiveness of the proposed framework in temporally segmenting expression activation in the presence of low or high head pose variations.
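To make the described pipeline concrete, the following is a minimal illustrative sketch, not the authors' actual architecture: it assumes per-frame consistency maps of a fixed spatial size, flattens them, and feeds them to a GRU that outputs a per-frame expression-activation score. All names, dimensions, and the choice of GRU are assumptions introduced here for illustration only.

```python
# Hypothetical sketch of the described pipeline (shapes and layer choices are
# assumptions, not taken from the paper): per-frame local motion consistency
# maps are flattened and passed to a recurrent network that predicts, for
# every frame, whether an expression is active.
import torch
import torch.nn as nn

class ExpressionSpotter(nn.Module):
    def __init__(self, map_height=16, map_width=16, hidden_size=64):
        super().__init__()
        self.input_size = map_height * map_width  # one consistency value per local facial region
        self.rnn = nn.GRU(self.input_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)     # per-frame activation score

    def forward(self, consistency_maps):
        # consistency_maps: (batch, frames, map_height, map_width)
        b, t = consistency_maps.shape[:2]
        x = consistency_maps.reshape(b, t, -1)
        h, _ = self.rnn(x)
        return torch.sigmoid(self.head(h)).squeeze(-1)  # (batch, frames), values in [0, 1]

# Toy usage: 2 sequences of 30 frames with 16x16 consistency maps.
maps = torch.rand(2, 30, 16, 16)
scores = ExpressionSpotter()(maps)
print(scores.shape)  # torch.Size([2, 30])
```

Thresholding the per-frame scores would yield the temporal intervals of expression activation described in the abstract.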