Regarding Topology and Adaptability in Differentiable WFST-Based E2E ASR

Published: 01 Jan 2024, Last Modified: 15 May 2025. Venue: ICASSP Workshops 2024. License: CC BY-SA 4.0
Abstract: The adaptability of End-to-End (E2E) Automatic Speech Recognition (ASR) models across diverse datasets remains a challenge, often attributed to acoustic model (AM) generalisability and internal language model (ILM) mismatch. This study examines the impact of topology on adaptability in differentiable WFST-based ASR. Through evaluations on various ASR corpora, we find that topology has a significant influence on adaptability. Notably, the performance of Connectionist Temporal Classification (CTC) degrades when acoustic features deviate substantially from those of its training set. Additionally, we confirm that the internal language models within these topologies are sufficiently weak, indicating that acoustic model generalisability is the primary factor influencing adaptability.