Toward Designing a Reduced Phone Set Using Text Decoding Accuracy Estimates in Speech BCI

Published: 01 Jan 2025, Last Modified: 04 Aug 2025, Venue: BIOSTEC (1) 2025, License: CC BY-SA 4.0
Abstract: Reducing the phone set in speech recognition or speech brain-computer interface (BCI) tasks improves phone discrimination accuracy. However, the reduction may also degrade text decoding accuracy because it increases the number of homonyms. To address this trade-off, we propose a novel estimator called the Generalized Pronunciation/Word Confusion Rate (GPWCR), which estimates text decoding accuracy by considering both phone discrimination performance and the number of homonyms. By minimizing the GPWCR, we designed an optimal reduced phone set. Experimental results from Japanese large-vocabulary speech recognition demonstrate that the optimal phone set, reduced from 39 to 38 phones, lowered the word error rate from 14.1% to 13.8%.
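The homonym side of this trade-off can be illustrated with a small sketch. It is not the GPWCR, whose definition the abstract does not give; it only counts how many words in a lexicon become indistinguishable by pronunciation once two phones are merged. The toy lexicon, the merge mapping, and the function names `merge_phones` and `count_homonym_words` are assumptions made for illustration.

```python
from collections import defaultdict


def merge_phones(pron, mapping):
    """Map each phone through `mapping`; unmapped phones stay unchanged."""
    return tuple(mapping.get(p, p) for p in pron)


def count_homonym_words(lexicon, mapping=None):
    """Count words whose (reduced) pronunciation is shared with another word."""
    mapping = mapping or {}
    groups = defaultdict(set)
    for word, pron in lexicon.items():
        groups[merge_phones(pron, mapping)].add(word)
    # A group with more than one word means those words are now homonyms.
    return sum(len(words) for words in groups.values() if len(words) > 1)


# Hypothetical toy lexicon (romanized entries, not taken from the paper).
lexicon = {
    "kaki": ("k", "a", "k", "i"),
    "kagi": ("k", "a", "g", "i"),
    "saki": ("s", "a", "k", "i"),
}

full = count_homonym_words(lexicon)                 # full phone set
reduced = count_homonym_words(lexicon, {"g": "k"})  # merge /g/ into /k/
print(full, reduced)
```

In this toy case, merging /g/ into /k/ makes "kaki" and "kagi" collide, so the homonym count rises from 0 to 2. An estimator of the kind described above would weigh such an increase against the gain in phone discrimination accuracy.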