Disfluent Cues for Enhanced Speech Understanding in Large Language Models

Morteza Rohanian; Farhad Nooralahzadeh; Omid Rohanian; David A. Clifton; Michael Krauthammer

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, EMNLP 2023 Findings

Submission Type: Regular Long Paper

Submission Track: Speech and Multimodality

Submission Track 2: Linguistic Theories, Cognitive Modeling, and Psycholinguistics

Keywords: disfluency detection, disfluencies, self-repairs, large language models, interruptions, contextual cues, spontaneous speech

Abstract: In computational linguistics, the common practice is to "clean" disfluent content from spontaneous speech. However, we hypothesize that these disfluencies might serve as more than mere noise, potentially acting as informative cues. We use a range of pre-trained models for a reading comprehension task involving disfluent queries, specifically featuring different types of speech repairs. The findings indicate that certain disfluencies can indeed improve model performance, particularly those stemming from context-based adjustments. However, large-scale language models struggle to handle repairs involving decision-making or the correction of lexical or syntactic errors, suggesting a crucial area for potential improvement. This paper thus highlights the importance of a nuanced approach to disfluencies, advocating that they be used to enhance model performance rather than removed.

Submission Number: 5730
