Abstract: We present a new descriptor for spontaneous facial expression recognition from videos acquired by a thermal sensor. Previous descriptors mostly compute features from RGB videos, which makes it difficult to handle mixed and varied spontaneous expressions whose facial appearance is highly ambiguous. In contrast, thermal imaging can measure autonomic activity, that is, the physiological changes evoked by the autonomic nervous system, regardless of the variety and ambiguity of facial appearance. This paper presents a new thermal video representation, the trajectory-pooled Fisher vector descriptor (TFD). To capture local energy and temperature changes, we propose to use spatio-temporal orientation energy and the acceleration of dense trajectories as low-level features, and we further improve the discriminative capacity by aggregating the local features with an improved Fisher vector. The benefits of TFD in comparison with existing approaches are illustrated on two databases using different modalities: the USTC-NVIE database and the MMSE (a.k.a. BP4D+) database.
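As a rough illustration of two ingredients named above, the following minimal sketch computes the acceleration of a single tracked point as a second-order finite difference and applies the normalization commonly associated with the improved Fisher vector (signed square root followed by L2 normalization). The function names `trajectory_acceleration` and `improved_fv_normalize` are hypothetical, and the finite-difference definition of acceleration is an assumption; the abstract does not specify how the paper computes either quantity.

```python
import math


def trajectory_acceleration(points):
    """Second-order finite difference of a tracked point's positions.

    points: list of (x, y) positions, one per frame (assumed input format).
    Returns a list of (ax, ay) tuples, two entries shorter than the input.
    """
    return [
        (x2 - 2 * x1 + x0, y2 - 2 * y1 + y0)
        for (x0, y0), (x1, y1), (x2, y2) in zip(points, points[1:], points[2:])
    ]


def improved_fv_normalize(fv, eps=1e-12):
    """Signed square-root (power) normalization, then L2 normalization.

    fv: an already-computed Fisher vector as a list of floats (assumed input).
    eps guards against division by zero for an all-zero vector.
    """
    powered = [math.copysign(math.sqrt(abs(v)), v) for v in fv]
    norm = math.sqrt(sum(v * v for v in powered))
    return [v / max(norm, eps) for v in powered]


# A point moving along x with steadily increasing speed.
track = [(0, 0), (1, 0), (3, 0), (6, 0)]
accel = trajectory_acceleration(track)

# A toy vector standing in for a Fisher vector.
normalized = improved_fv_normalize([4.0, -9.0, 0.0])
```

In a full pipeline, the low-level features would be encoded against a Gaussian mixture model before this normalization step; that encoding is omitted here.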