Towards a generic approach for automatic speech recognition error detection and classification

Published: 01 Jan 2018, Last Modified: 14 Mar 2025, ATSIP 2018, CC BY-SA 4.0
Abstract: Automatic Speech Recognition (ASR) errors are essentially unavoidable, which motivates the development of post hoc tools to handle them. This paper addresses errors in continuous speech recognition output in order to improve the exploitation of ASR transcriptions. We propose a generic classifier-based approach for both error detection and error type classification. Unlike the majority of research in this field, we handle recognition errors independently of the ASR decoder, using a set of features derived exclusively from the recognizer output; the approach should therefore be usable with any ASR system. Experiments on TV program transcription data show that the proposed non-decoder feature setup achieves performance competitive with state-of-the-art systems in ASR error detection and classification. Furthermore, Support Vector Machines trained on the proposed feature set appear to be an effective classifier for ASR error type classification, with an accuracy of 82.41%.
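The abstract does not say how per-word error labels are obtained for training such a classifier. One common way, assumed here and not confirmed by the paper, is to align each ASR hypothesis with its reference transcript by minimum edit distance and tag every hypothesis word as correct, a substitution, or an insertion. The minimal sketch below illustrates that labeling step only; the function name `label_hypothesis` and the three-label set are hypothetical, and the paper's own features and label inventory may differ.

```python
from typing import List, Tuple


def label_hypothesis(reference: List[str], hypothesis: List[str]) -> List[Tuple[str, str]]:
    """Tag each hypothesis word as 'correct', 'substitution', or 'insertion'.

    Hypothetical helper: labels come from a Levenshtein alignment against the
    reference. Deleted reference words have no hypothesis token and are skipped.
    """
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = edit distance between reference[:i] and hypothesis[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = int(reference[i - 1] != hypothesis[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or substitution
                cost[i - 1][j] + 1,             # deletion (reference word dropped)
                cost[i][j - 1] + 1,             # insertion (extra hypothesis word)
            )

    # Walk back from the bottom-right corner to recover one optimal alignment.
    labels: List[Tuple[str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            mismatch = int(reference[i - 1] != hypothesis[j - 1])
            if cost[i][j] == cost[i - 1][j - 1] + mismatch:
                tag = "substitution" if mismatch else "correct"
                labels.append((hypothesis[j - 1], tag))
                i -= 1
                j -= 1
                continue
        if j > 0 and cost[i][j] == cost[i][j - 1] + 1:
            labels.append((hypothesis[j - 1], "insertion"))
            j -= 1
        else:
            i -= 1  # deletion: nothing in the hypothesis to label
    labels.reverse()
    return labels


reference = "the cat sat on the mat".split()
hypothesis = "the cat sat on a the hat".split()
labels = label_hypothesis(reference, hypothesis)
```

In this setting, error detection reduces to the binary question of whether a word's tag is anything other than `correct`, while error type classification distinguishes among the error tags. The labeled words would then be paired with features computed from the recognizer output alone and passed to a classifier such as an SVM.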