Emosical: An Emotion Annotated Musical Theatre Dataset

ACL ARR 2024 June Submission 1154 Authors

14 Jun 2024 (modified: 06 Aug 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: This paper presents Emosical, an open-source multimodal dataset of musical films. Emosical comprises paired samples of video, vocal audio, text, and character identity, each annotated with emotion tags. Emosical provides rich emotion annotations for each sample by inferring the characters' background stories. To derive the emotion tags, we leverage the musical theatre script, which contains the characters' complete background stories and narrative contexts. The annotation pipeline feeds the singing character, the text, the global persona, and the context of the dialogue and song track into a large language model (LLM). To verify the effectiveness of our tagging scheme, we perform an ablation study that bypasses each step of the pipeline in turn, and we conduct a subjective test to compare the tags generated under each ablation. We also perform a statistical analysis to identify the global characteristics of the collected emotion tags. Emosical could support future research on expressive synthesis and tagging of the singing voice in the musical theatre domain.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: multimodal applications
Contribution Types: Data resources
Languages Studied: English
Submission Number: 1154