Abstract: Highlights
• We propose a distinct framework to concurrently extract spatial-temporal features for dynamic facial expression recognition.
• We introduce a module for spatial-temporal interaction learning and comprehensive facial expression feature extraction.
• Our method achieves state-of-the-art results on three benchmarks: DFEW, AFEW, and FERV39k.