Keywords: Emotion recognition, deception detection, facial emotion embedding sequence
Abstract: While multimodal deception detection methods improve detection efficiency, they inevitably incur higher data collection and processing costs. Deceptive behavior is often accompanied by emotional fluctuations such as tension, anxiety, and guilt, which can produce contradictory, inconsistent, or suppressed emotions in a person's facial expressions. This paper treats deception detection as an abnormal signal recognition problem, aiming to capture abnormal features from regular behavior patterns. First, the faces in a video are converted into a set of learnable facial emotion embedding sequences. A Time-LSTM-GCN module is then proposed to model the spatiotemporal relationships between these embedding sequences. A combined adversarial loss optimizes the decision boundary for deceptive behavior. This loss has two main components. First, semi-supervised learning of the dominant facial emotions strengthens the representational power of the embedding sequence. Second, by comparing the similarity between embedding nodes with the same emotion (positive samples) and embedding nodes with different emotions (negative samples), the model is encouraged to capture both the local structure within a sequence and the global differences between sequences. Experimental results show that the proposed baseline model outperforms existing deception detection methods based on multimodal or multi-type features. Code is provided in the supplementary material.
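The second loss component compares embedding nodes that share an emotion (positives) against nodes with different emotions (negatives). The abstract does not give the exact formulation, so the following is only a minimal sketch of one common way to express such a comparison: a supervised, InfoNCE-style contrastive loss over cosine similarities. The function names `cosine` and `contrastive_loss`, the `temperature` parameter, and the toy embeddings are all illustrative assumptions, not the authors' implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def contrastive_loss(embeddings, labels, temperature=0.5):
    """Hypothetical supervised contrastive loss over embedding nodes.

    Nodes with the same emotion label as the anchor are positives;
    all other nodes are negatives. `temperature` is an assumed
    hyperparameter, not a value taken from the paper.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Scaled similarity of the anchor to every other node.
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n) if j != i
        }
        log_denom = math.log(sum(math.exp(s) for s in logits.values()))
        # Average negative log-probability assigned to the positives.
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

# Toy example: two tight clusters in 2-D.
nodes = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = contrastive_loss(nodes, ["happy", "happy", "fear", "fear"])
shuffled = contrastive_loss(nodes, ["happy", "fear", "happy", "fear"])
print(aligned < shuffled)
```

In this sketch the loss is lower when the emotion labels agree with the geometry of the embeddings, which is the behavior a loss of this kind is meant to reward.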
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 25303