Adaptive Decoding for Efficient Automatic Speech Recognition

Anonymous

16 Feb 2024, ACL ARR 2024 February Blind Submission, Readers: Everyone
Abstract: The latency and computational demand of end-to-end (E2E) automatic speech recognition (ASR) models hinder their deployment on lightweight devices. We find that, although these models can be tuned for efficiency, the computational burden of large vocabularies remains a challenge. In this paper, we propose an adaptive decoding method (ADD) to speed up E2E ASR systems. ADD segments the vocabulary based on the inherent characteristics of speech, enabling the model to predict each word from a much smaller vocabulary. Our method significantly reduces the required FLOPs. We also find that unit-based methods, developed through self-supervised learning, capture acoustic features well and achieve performance comparable to phone-based methods.
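To give a rough sense of why predicting from a smaller vocabulary can cut FLOPs, the sketch below compares the cost of one dense output projection over a full vocabulary with a hypothetical two-step scheme that first picks a vocabulary segment and then scores only that segment. This is an illustration under stated assumptions, not the method of the paper: the abstract does not say how segments are chosen or sized, and the function names, the equal-size segments, and the example dimensions are all made up for this sketch.

```python
def full_projection_flops(hidden_dim: int, vocab_size: int) -> int:
    """FLOPs for one dense projection onto the full vocabulary.

    Counts one multiply and one add per weight (2 * hidden_dim * vocab_size).
    """
    return 2 * hidden_dim * vocab_size


def segmented_projection_flops(hidden_dim: int, vocab_size: int, num_segments: int) -> int:
    """FLOPs for a hypothetical two-step prediction over a segmented vocabulary.

    Assumption (not from the abstract): a small classifier picks one of
    `num_segments` equal-size segments, then only that segment is scored.
    """
    segment_size = -(-vocab_size // num_segments)  # ceiling division
    pick_segment = 2 * hidden_dim * num_segments   # score every segment once
    score_segment = 2 * hidden_dim * segment_size  # score words in the chosen segment
    return pick_segment + score_segment


if __name__ == "__main__":
    # Illustrative sizes only; they are not taken from the paper.
    hidden_dim, vocab_size, num_segments = 512, 10_000, 100
    full = full_projection_flops(hidden_dim, vocab_size)
    segmented = segmented_projection_flops(hidden_dim, vocab_size, num_segments)
    print(f"full vocabulary: {full} FLOPs per step")
    print(f"segmented:       {segmented} FLOPs per step")
    print(f"ratio:           {full / segmented:.1f}x")
```

Under these assumptions the per-step cost scales with the number of segments plus the segment size rather than with the full vocabulary size, which is the general reason a smaller effective vocabulary lowers output-layer FLOPs.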
Paper Type: short
Research Area: Speech recognition, text-to-speech and spoken language understanding
Contribution Types: Approaches low compute settings-efficiency
Languages Studied: English