Understanding Emotions: A Dataset of Tweets to Study Interactions between Affect Categories

Saif Mohammad, Svetlana Kiritchenko

2018 (modified: 04 Nov 2022), LREC 2018

Abstract: Human emotions are complex and nuanced. Yet an overwhelming majority of the work on automatically detecting emotions from text has focused only on classifying text into positive, negative, and neutral classes. Our goal is to create a single textual dataset that is annotated for many emotion dimensions (from both the basic emotion model and the VAD model). For each emotion dimension, we annotate tweets not just for coarse classes (such as anger or no anger) but also for fine-grained real-valued scores indicating the intensity of emotion. We use Best-Worst Scaling to address the limitations of traditional rating scale methods, such as inter- and intra-annotator inconsistency. We show that the fine-grained intensity scores thus obtained are reliable. The new dataset is useful for training and testing supervised machine learning algorithms for multi-label emotion classification, emotion intensity regression, detecting valence, detecting ordinal class of intensity of emotion (slightly sad, very angry, etc.), and detecting ordinal class of valence. The dataset also sheds light on crucial research questions such as: Which emotions are often present together in tweets? How do the intensities of the three negative emotions relate to each other? How do the intensities of the basic emotions relate to valence?
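
The abstract does not say how Best-Worst Scaling annotations become real-valued scores. As an illustration only, the sketch below assumes the simple counting procedure often used with Best-Worst Scaling: an item's score is the fraction of times it was chosen as best minus the fraction of times it was chosen as worst, which gives a value between -1 and 1. The function name `bws_scores`, the tuple layout, and the tweet identifiers are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def bws_scores(annotations):
    """Turn Best-Worst Scaling judgments into real-valued scores.

    `annotations` is an iterable of (items, best, worst) tuples, where
    `items` is the group of items shown to an annotator and `best` /
    `worst` are the items picked as most and least intense.
    Assumed scoring rule: (times best - times worst) / times shown.
    """
    best_counts = defaultdict(int)
    worst_counts = defaultdict(int)
    shown_counts = defaultdict(int)

    for items, best, worst in annotations:
        for item in items:
            shown_counts[item] += 1
        best_counts[best] += 1
        worst_counts[worst] += 1

    return {
        item: (best_counts[item] - worst_counts[item]) / shown_counts[item]
        for item in shown_counts
    }


# Hypothetical example: one 4-tuple of tweets judged by three annotators.
group = ("t1", "t2", "t3", "t4")
example = [
    (group, "t1", "t4"),
    (group, "t1", "t3"),
    (group, "t2", "t4"),
]
scores = bws_scores(example)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Under this assumed rule, a score in [-1, 1] can be mapped linearly to [0, 1] with (score + 1) / 2 if an intensity scale starting at zero is preferred.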
