Towards Dialogue Modeling Beyond Text

Published: 05 May 2023, Last Modified: 07 Mar 2025, ICASSP 2023, Everyone, CC BY 4.0
Abstract: In this paper, we model aspects of communication beyond the words that are said. Specifically, we aim to detect interruptions and active listening events, which are important elements in any dialogue. We build a dataset with fine-grained annotations for each category and train multimodal models that take into account all channels in a digital conversation, that is, the video, the audio, and the text. Our experiments show that multimodality is necessary for modeling the non-textual aspects of a conversation, as different artifacts require different modalities to capture effectively.