I Learned Error, I Can Fix It! : A Detector-Corrector Structure for ASR Error Calibration

Published: 01 Jan 2023, Last Modified: 22 May 2025. Venue: INTERSPEECH 2023. License: CC BY-SA 4.0
Abstract: Speech recognition technology has improved in recent years. However, in spoken language understanding (SLU), inputs that contain automatic speech recognition (ASR) errors still cause significant performance degradation in downstream tasks. Various ASR error correction methods have been proposed to address this issue. ASR error correction mainly focuses on correcting and generating only the error span, using a conditional decoding method. To this end, we propose a structure consisting of a Detector, which uses collaborative training to predict various error patterns, and a Corrector, which corrects the error spans identified by the Detector. This pipeline reduces Word Error Rate (WER) and yields less performance degradation in downstream tasks than the original ASR hypotheses. In addition, we show that it generalizes to various downstream data. By leveraging this Detector-Corrector pipeline, we expect to achieve effective ASR error correction and enable high-quality SLU downstream tasks.
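For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of how Word Error Rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. The function name `word_error_rate`, the whitespace tokenization, and the example sentences are illustrative assumptions; they are not taken from the paper, whose evaluation setup may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation normalization).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the first i-1 reference words
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substituted word out of five reference words.
reference = "book a flight to boston"
asr_hypothesis = "book a fight to boston"
print(word_error_rate(reference, asr_hypothesis))
```

Under this definition, a correction step lowers WER whenever it turns a misrecognized word back into the reference word without introducing new errors elsewhere, which is why restricting edits to the detected error span is attractive.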