Hierarchical Multimodal Transformer with Localness and Speaker Aware Attention for Emotion Recognition in Conversations

2020 (modified: 16 Oct 2021), NLPCC (2) 2020, Readers: Everyone
Abstract: Emotion Recognition in Conversations (ERC) aims to predict the emotion of each utterance in a given conversation. Existing approaches to the ERC task suffer from two main drawbacks: (1) they pay too little attention to the emotional impact of the local context; (2) they ignore the emotional inertia of speakers. To address these limitations, we first propose a Hierarchical Multimodal Transformer as our base model, and then design a localness-aware attention mechanism and a speaker-aware attention mechanism to capture the impact of the local context and the emotional inertia, respectively. Extensive evaluations on a benchmark dataset demonstrate the superiority of our proposed model over existing multimodal methods for ERC.
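
To make the two attention ideas concrete, here is a minimal, hypothetical sketch in plain Python. The abstract does not specify how either mechanism is formulated, so everything here is an assumption for illustration: localness is modeled as a Gaussian distance penalty added to the attention scores, speaker awareness as a mask that restricts attention to utterances by the same speaker, and all function names and the `sigma` parameter are invented for this example. It is not the authors' implementation.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of attention scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def localness_bias(n, sigma=1.0):
    # Assumed form: Gaussian penalty -(i - j)^2 / (2 * sigma^2), so that
    # utterances far from position i receive lower attention scores.
    return [[-((i - j) ** 2) / (2 * sigma ** 2) for j in range(n)]
            for i in range(n)]

def speaker_mask(speakers):
    # Assumed form: 0 where two utterances share a speaker, -inf otherwise,
    # so each utterance attends only to its own speaker's utterances.
    return [[0.0 if a == b else float("-inf") for b in speakers]
            for a in speakers]

def biased_attention(scores, bias):
    # Add the bias to the raw scores, then normalize each row.
    return [softmax([s + b for s, b in zip(srow, brow)])
            for srow, brow in zip(scores, bias)]

# Toy conversation: four utterances alternating between speakers A and B.
speakers = ["A", "B", "A", "B"]
n = len(speakers)
scores = [[0.0] * n for _ in range(n)]  # uniform raw scores for clarity

local_weights = biased_attention(scores, localness_bias(n, sigma=1.0))
speaker_weights = biased_attention(scores, speaker_mask(speakers))
```

With uniform raw scores, `local_weights[0]` decays with distance from the first utterance, while `speaker_weights[0]` splits its weight evenly between the two utterances by speaker A and gives none to speaker B.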