Using Paralinguistic Information to Disambiguate User Intentions for Distinguishing Phrase Structure and Sarcasm in Spoken Dialog Systems

2021 (modified: 09 Nov 2021), SLT 2021
Abstract: This paper uses paralinguistic information that is usually hidden in speech signals, such as pitch, short pauses, and sarcasm, to disambiguate user intentions that cannot easily be distinguished from the speech recognition and natural language understanding results of a state-of-the-art spoken dialog system (SDS). We propose two methods that address ambiguities in understanding named entities and sentence structures based on relevant speech cues and nuances. We also propose an approach to capturing sarcasm in speech and generating sarcasm-sensitive responses using an end-to-end neural network. An SDS prototype that directly feeds signal information into the understanding and response generation components has also been developed to support the three proposed applications. We have achieved encouraging experimental results in this initial study, demonstrating the potential of this new research direction.