Abstract: Facial Expression Recognition (FER) plays a crucial role in real-world applications. However, large-scale FER datasets collected in the wild usually contain noisy labels. More importantly, because emotion is inherently ambiguous, facial images that convey multiple emotions are hard to distinguish from images with noisy labels. Training a robust FER model is therefore challenging. To address this, we propose Emotion Ambiguity-SEnsitive cooperative networks (EASE), which contain two components. First, the ambiguity-sensitive learning module divides the training samples into three groups. Samples with small losses in both networks are considered clean, and those with large losses in both are considered noisy. For the conflict samples, on which one network disagrees with the other, we use the polarity cues of emotions to distinguish samples conveying ambiguous emotions from those with noisy labels. We use KL divergence to optimize the networks on these samples, enabling them to attend to the non-dominant emotions. The second component of EASE enhances the diversity of the cooperative networks. As training proceeds, the cooperative networks tend to converge to a consensus. We therefore construct a penalty term based on the correlation between the networks' features, which helps the networks learn diverse representations from the images. Extensive experiments on 6 popular facial expression datasets demonstrate that EASE outperforms state-of-the-art approaches.
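
The three-way split described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `partition_by_loss`, the `clean_fraction` parameter, and the rule that "small loss" means "among the lowest-loss fraction of samples for that network" are all assumptions, since the abstract does not state how the loss threshold is chosen.

```python
from typing import List, Tuple


def partition_by_loss(
    losses_a: List[float],
    losses_b: List[float],
    clean_fraction: float = 0.5,  # assumed hyperparameter, not from the abstract
) -> Tuple[List[int], List[int], List[int]]:
    """Split sample indices into (clean, noisy, conflict) groups.

    losses_a / losses_b hold the per-sample losses of the two networks.
    """
    if len(losses_a) != len(losses_b):
        raise ValueError("both networks must score the same samples")

    n = len(losses_a)
    k = int(clean_fraction * n)

    # Assumption: a sample is "small-loss" for a network if it is among
    # that network's k lowest-loss samples.
    small_a = set(sorted(range(n), key=lambda i: losses_a[i])[:k])
    small_b = set(sorted(range(n), key=lambda i: losses_b[i])[:k])

    clean = sorted(small_a & small_b)      # small loss in both networks
    conflict = sorted(small_a ^ small_b)   # the networks disagree
    noisy = sorted(set(range(n)) - small_a - small_b)  # large loss in both
    return clean, noisy, conflict


clean, noisy, conflict = partition_by_loss(
    [0.1, 0.2, 2.0, 3.0],
    [0.15, 2.5, 0.3, 3.5],
)
```

In this toy call, sample 0 has a small loss under both networks, sample 3 has a large loss under both, and samples 1 and 2 are the conflict samples that would then be examined with the polarity cues.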