A synchronized pruning composition algorithm of weighted finite state transducers for large vocabulary speech recognition

Zhiyang He, Ping Lv, Wei Li, Ji Wu

Published: 2012, Last Modified: 13 May 2025, ISCSLP 2012, CC BY-SA 4.0

Abstract: Weighted finite state transducers (WFSTs) have become a very attractive approach for large vocabulary continuous speech recognition (LVCSR). Composition is an important operation for combining WFSTs of different levels. However, the general composition algorithm may generate non-coaccessible states (states from which no final state can be reached), which can require a large amount of memory, especially in LVCSR applications. The general algorithm does not remove these states and their related transitions until composition has finished. This paper proposes an improved depth-first composition algorithm that analyzes the properties of each newly generated state during composition and promptly removes almost all non-coaccessible states and their related transitions. As a result, the memory required for WFST composition can be significantly decreased. Experimental results on a Chinese Broadcast News task (41022 words) show that a 20% to 26% reduction in memory can be achieved with an increase of about 5% in time complexity.
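
The idea of pruning during depth-first composition can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details the abstract does not give. The sketch assumes epsilon-free transducers with tropical weights (weights add along a path), and the names `Wfst` and `compose_pruned` are hypothetical. A composed state is discarded as soon as its depth-first visit finishes if it is not final and none of its transitions leads to a kept state. A state whose only surviving transitions point back to a state still on the search stack is kept conservatively, so some non-coaccessible states inside cycles may remain.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Wfst:
    """Toy transducer (hypothetical layout, not a real library type)."""
    start: int
    finals: dict                               # state -> final weight
    arcs: dict = field(default_factory=dict)   # state -> [(in, out, weight, next)]


def compose_pruned(a, b):
    """Depth-first composition of epsilon-free a and b with on-the-fly pruning."""
    ids = {}                    # (qa, qb) -> id of a kept or in-progress state
    dead = set()                # pairs proven non-coaccessible
    counter = itertools.count()
    out = Wfst(start=0, finals={})

    def visit(pair):
        # Returns the composed state id, or None if the pair was pruned.
        if pair in dead:
            return None
        if pair in ids:         # already kept, or still on the DFS stack
            return ids[pair]
        sid = ids[pair] = next(counter)
        qa, qb = pair
        kept_arcs = []
        for i, x, w1, na in a.arcs.get(qa, []):
            for y, o, w2, nb in b.arcs.get(qb, []):
                if x != y:      # output of a must match input of b
                    continue
                target = visit((na, nb))
                if target is not None:
                    kept_arcs.append((i, o, w1 + w2, target))
        is_final = qa in a.finals and qb in b.finals
        if not kept_arcs and not is_final:
            # No path to a final state: drop the state immediately and
            # remember only the pair, so it is never expanded again.
            del ids[pair]
            dead.add(pair)
            return None
        out.arcs[sid] = kept_arcs
        if is_final:
            out.finals[sid] = a.finals[qa] + b.finals[qb]
        return sid

    visit((a.start, b.start))   # if the start pair is dead, out stays empty
    return out


# Small example: the pair (2, 2) is reachable but not final in b, so it is pruned.
a = Wfst(start=0, finals={1: 0.0, 2: 0.0},
         arcs={0: [("a", "x", 0.5, 1), ("b", "y", 1.0, 2)]})
b = Wfst(start=0, finals={1: 0.0},
         arcs={0: [("x", "X", 1.5, 1), ("y", "Y", 0.25, 2)]})
result = compose_pruned(a, b)
print(result.arcs, result.finals)
```

The sketch uses recursion for brevity; for transducers of LVCSR size an explicit stack would be needed, and a final trimming pass would still be required to remove the states kept conservatively.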