Error-driven pronunciation dictionary construction for Mandarin speech recognition

Published: 2014, Last Modified: 07 Nov 2025 | ISCSLP 2014 | CC BY-SA 4.0
Abstract: To construct a pronunciation dictionary for Mandarin speech recognition, an automatic, error-driven, incremental approach based on the acoustic confusion network is proposed. The method takes both acoustic and language information into account and builds the dictionary through word selection and composition so as to optimize ASR performance directly. During the process, removing and splitting operations are applied to control the dictionary size and to avoid getting stuck in a local optimum. In addition, a simulated annealing algorithm is used to search for the globally optimal dictionary. Experiments on Mandarin speech recognition show that the system using the dictionary constructed by the proposed approach achieves a 1.01% absolute reduction in character error rate compared with the baseline at the same dictionary size. Moreover, the proposed approach matches the performance of the best baseline while reducing the dictionary size from 30000 to 20000.
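As a rough illustration of how simulated annealing can search over dictionaries of a fixed size, the following minimal sketch swaps one word out and one word in at each step, and sometimes accepts a worse dictionary so that the search can leave a local optimum. It is not the procedure from the paper: the acoustic confusion network and the selection, composition, removing, and splitting operations are not reproduced, and the function `anneal_dictionary`, its parameters, and the toy cost (a stand-in for character error rate) are all hypothetical names chosen for this example.

```python
import math
import random


def anneal_dictionary(candidates, cost, size, steps=2000, t0=1.0,
                      cooling=0.995, seed=0):
    """Simulated annealing over fixed-size word subsets (illustrative only).

    `cost` is any function mapping a set of words to a number to minimize;
    here it stands in for a recognition error rate (an assumption).
    """
    rng = random.Random(seed)
    pool = sorted(candidates)
    current = set(rng.sample(pool, size))
    current_cost = cost(current)
    best, best_cost = set(current), current_cost
    temp = t0
    for _ in range(steps):
        unused = [w for w in pool if w not in current]
        if not unused:
            break  # every candidate is already in the dictionary
        # Propose a swap: drop one word, add one unused candidate.
        out_word = rng.choice(sorted(current))
        in_word = rng.choice(unused)
        proposal = (current - {out_word}) | {in_word}
        proposal_cost = cost(proposal)
        delta = proposal_cost - current_cost
        # Always accept improvements; accept a worse proposal with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = proposal, proposal_cost
            if current_cost < best_cost:
                best, best_cost = set(current), current_cost
        temp *= cooling  # geometric cooling schedule
    return best, best_cost


# Toy usage: 20 candidate words, of which 5 are "useful"; the cost counts
# how many useful words are missing from the dictionary.
words = [f"w{i}" for i in range(20)]
useful = set(words[:5])


def toy_cost(dictionary):
    return len(useful - dictionary)


best, best_cost = anneal_dictionary(words, toy_cost, size=5)
```

In a real system the cost would come from decoding a development set with each candidate dictionary, which is far more expensive than this toy function, so the number of steps and the cooling rate would need to be chosen with that budget in mind.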