C2A-SLU: Cross and Contrastive Attention for Improving ASR Robustness in Spoken Language Understanding

17 Dec 2023 · OpenReview Archive Direct Upload
Abstract: Spoken language understanding (SLU) is a critical task in task-oriented dialogue systems. However, automatic speech recognition (ASR) errors often impair understanding performance. Although many previous models have obtained promising results for improving ASR robustness in SLU, most of them treat clean manual transcripts and ASR transcripts equally during the fine-tuning stage. To tackle this issue, in this paper we propose a novel method termed C2A-SLU. Specifically, we add the computed cross attention to the original hidden states and apply contrastive attention to compare the input transcript with clean manual transcripts to distill contrastive information, which better captures the distinctive features of ASR transcripts. Experiments on three datasets show that C2A-SLU surpasses existing models and achieves new state-of-the-art performance, with a relative improvement of 3.4% in accuracy over the previous best model on the SLURP dataset.
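The abstract only sketches the mechanism, so the following is a minimal, illustrative PyTorch sketch of the general idea: cross attention from ASR-transcript hidden states to clean-transcript hidden states added residually, plus a contrastive-style attention that emphasizes where the ASR representation diverges from the clean one. The class name, hyper-parameters, and the exact contrast operation are assumptions for illustration, not the paper's C2A-SLU formulation.

```python
import torch
import torch.nn as nn

class CrossContrastiveAttention(nn.Module):
    """Illustrative sketch (not the paper's exact architecture):
    cross attention over clean-transcript states is added back to the
    ASR hidden states, and a simple contrastive attention keeps the
    residual difference between the two views."""

    def __init__(self, hidden_dim: int, num_heads: int = 8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.contrast_attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden_dim)

    def forward(self, asr_hidden: torch.Tensor, clean_hidden: torch.Tensor) -> torch.Tensor:
        # Cross attention: ASR states query the clean-transcript states,
        # and the result is added to the original hidden states (residual).
        cross_out, _ = self.cross_attn(asr_hidden, clean_hidden, clean_hidden)
        fused = self.norm(asr_hidden + cross_out)

        # Contrastive attention (assumed form): attend over the clean states
        # again and keep the residual difference, highlighting tokens where
        # the ASR transcript deviates from the clean transcript.
        contrast_out, _ = self.contrast_attn(fused, clean_hidden, clean_hidden)
        contrastive_features = fused - contrast_out

        return fused + contrastive_features


if __name__ == "__main__":
    # Toy usage with random encoder states (batch=2, ASR len=12, clean len=10, dim=256).
    asr_states = torch.randn(2, 12, 256)
    clean_states = torch.randn(2, 10, 256)
    layer = CrossContrastiveAttention(hidden_dim=256)
    out = layer(asr_states, clean_states)
    print(out.shape)  # torch.Size([2, 12, 256])
```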