Abstract: Anticipating the subjective emotional responses of users is a valuable capability for automatic dialogue systems. In this work, given a dialogue fragment, we address the problem of predicting the subjective emotional response to the upcoming utterance, i.e. the emotion that the next speaker will express. To do so, we also take the personality traits of the next speaker into account as input. We compare two approaches: a Single-Task (ST) architecture and a Multi-Task (MT) architecture. Our hypothesis is that the MT architecture can learn a richer representation of the features that matter for predicting emotional reactions. We evaluate both models on the Personality EmotionLines Dataset (PELD), the only publicly available English dataset that provides individual information about the participants. The results show that our proposed MT approach outperforms both the ST and state-of-the-art approaches in predicting the subjective emotional response of the next utterance.