The Aligned Multimodal Movie Treebank: An audio, video, dependency-parse treebank

Anonymous

16 Jan 2022 (modified: 05 May 2023), ACL ARR 2022 January Blind Submission, Readers: Everyone

Abstract: Treebanks have traditionally included only text and have been derived from written sources such as newspapers or the web. We introduce the Aligned Multimodal Movie Treebank (AMMT), an English-language treebank derived from naturalistic dialog in Hollywood movies. It includes the source video and audio, transcriptions with word-level alignment to the audio stream, and part-of-speech tags and dependency parses in the Universal Dependencies (UD) formalism. AMMT consists of 31,264 sentences and 218,090 words, which will make it the third-largest UD English treebank and the only multimodal treebank in UD. To help with the web-based annotation effort, we also introduce the Efficient Audio Alignment Annotator (EAAA), a companion tool that enables annotators to significantly speed up the annotation process.

Paper Type: short
