Detecting actionable items in meetings by convolutional deep structured semantic models

Yun-Nung Chen, Dilek Hakkani-Tür, Xiaodong He

2015 (modified: 24 Apr 2023), ASRU 2015, Readers: Everyone

Abstract: The recent success of voice interaction with smart devices (human-machine genre) and improvements in speech recognition for conversational speech show the possibility of conversation-related applications. This paper investigates the task of actionable item detection in meetings (human-human genre), where an intelligent assistant dynamically provides the participants access to information (e.g., scheduling a meeting, taking notes) without interrupting the meetings. A convolutional deep structured semantic model (CDSSM) is applied to learn the latent semantics for human actions and utterances from human-machine (source genre) and human-human (target genre) interactions. Furthermore, considering the mismatch between the source and target genres and the scarcity of annotated data sets for the target genre, we develop adaptation techniques that adjust the learned embeddings to better fit the target genre. Experiments show that CDSSM performs better for actionable item detection than baselines using lexical features (27.5% relative) and other semantic features (15.9% relative) when the source and target genres match. When the target genre differs from the source genre, our proposed adaptation techniques further improve the performance. The discussion and analysis of the experiments provide a reasonable direction for such an actionable item detection task.
