Bangla Voice Command Recognition in End-to-End System Using Topic Modeling Based Contextual Rescoring

Abstract: In this work, we perform contextual rescoring using multi-label topic modeling to improve the performance of an End-to-End Bangla voice command recognition system. Our End-to-End architecture uses a hybrid of Connectionist Temporal Classification (CTC) and an attention mechanism. We use a Recurrent Neural Network (RNN) as the language model and Labeled LDA (Latent Dirichlet Allocation) for contextual rescoring. Our experiments show that our rescoring method reduces the Word Error Rate (WER) from 16.7% to 12.8% on a Bangla voice command recognition task when relevant context is provided. The system does not lose any performance when irrelevant context is provided.
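
To make the idea of contextual rescoring concrete, the following is a minimal sketch of n-best rescoring with a topic score. It is not the scoring formula used in this work, which the abstract does not specify. The names `topic_score`, `rescore`, `weight`, and `floor` are hypothetical, the topic-word probabilities are hand-written stand-ins for what a Labeled LDA model might supply, and English toy words replace Bangla commands for readability.

```python
import math

def topic_score(words, topic_word_probs, floor=1e-6):
    """Average log-probability of the words under one context topic.

    Words missing from the topic get the small `floor` probability
    (an assumption of this sketch, not a detail from the paper).
    """
    if not words:
        return math.log(floor)
    total = sum(math.log(topic_word_probs.get(w, floor)) for w in words)
    return total / len(words)

def rescore(nbest, topic_word_probs, weight=0.5, floor=1e-6):
    """Re-rank (text, base_log_score) pairs by base + weight * topic score."""
    rescored = [
        (text, base + weight * topic_score(text.split(), topic_word_probs, floor))
        for text, base in nbest
    ]
    # Highest combined log-score first.
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# Toy n-best list: the recognizer slightly prefers the wrong hypothesis.
nbest = [("turn of the light", -4.0), ("turn off the light", -4.2)]

# Hypothetical topic-word distributions for a relevant and an irrelevant context.
lighting_topic = {"turn": 0.2, "off": 0.2, "the": 0.1, "light": 0.3}
music_topic = {"play": 0.5, "song": 0.5}

best_with_relevant = rescore(nbest, lighting_topic)[0][0]
best_with_irrelevant = rescore(nbest, music_topic)[0][0]
```

In this toy setup the relevant topic promotes "turn off the light" to the top. With the irrelevant topic, no hypothesis word is in the topic vocabulary, so every hypothesis receives the same floor score and the recognizer's original ranking is kept. That mirrors, in a simplified way, the reported behaviour that irrelevant context does not hurt performance.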