Learning feature mapping using deep neural network bottleneck features for distant large vocabulary speech recognition

Ivan Himawan, Petr Motlícek, David Imseng, Blaise Potard, Namhoon Kim, Jaewon Lee

2015 (modified: 07 Nov 2022), ICASSP 2015

Abstract: Automatic speech recognition from distant microphones is difficult because recordings are affected by reverberation and background noise. First, the application of deep neural network (DNN)/hidden Markov model (HMM) hybrid acoustic models to distant speech recognition on the AMI meeting corpus is investigated. This paper then proposes a feature transformation that removes reverberation and background noise artefacts from bottleneck features, using a DNN trained to learn the mapping between distant-talking speech features and close-talking speech bottleneck features. Experimental results on the AMI meeting corpus reveal that the mismatch between close-talking and distant-talking conditions is largely reduced, with about 16% relative improvement over a conventional bottleneck system (trained on close-talking speech). If the feature mapping is applied to close-talking speech, a minor degradation of 4% relative is observed.
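
The feature mapping described in the abstract can be illustrated with a minimal sketch: a small regression network that takes distant-talking features as input and is trained to reproduce close-talking bottleneck features. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names `train_feature_mapping` and `map_features`, the single tanh hidden layer, the mean squared error objective, the layer sizes, and the synthetic data standing in for parallel distant/close-talking recordings.

```python
import numpy as np


def train_feature_mapping(distant, close_bn, hidden=32, epochs=200, lr=0.2, seed=0):
    """Fit a one-hidden-layer regression network (hypothetical stand-in for the
    mapping DNN) from distant-talking features to close-talking bottleneck
    features, using full-batch gradient descent on mean squared error."""
    rng = np.random.default_rng(seed)
    d_in, d_out = distant.shape[1], close_bn.shape[1]
    W1 = rng.normal(0.0, 0.1, (d_in, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0.0, 0.1, (hidden, d_out))
    b2 = np.zeros(d_out)
    losses = []
    for _ in range(epochs):
        # Forward pass: tanh hidden layer, linear output layer.
        h = np.tanh(distant @ W1 + b1)
        pred = h @ W2 + b2
        err = pred - close_bn
        losses.append(float(np.mean(err ** 2)))
        # Backward pass for the mean squared error loss.
        g_pred = 2.0 * err / err.size
        g_W2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_h = (g_pred @ W2.T) * (1.0 - h ** 2)
        g_W1 = distant.T @ g_h
        g_b1 = g_h.sum(axis=0)
        # Plain gradient descent update.
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2
    return (W1, b1, W2, b2), losses


def map_features(params, distant):
    """Apply the trained mapping to distant-talking features."""
    W1, b1, W2, b2 = params
    return np.tanh(distant @ W1 + b1) @ W2 + b2


# Synthetic stand-in data (assumption): 256 frames, 13-dim input features,
# 8-dim bottleneck targets produced by an arbitrary nonlinear function.
rng = np.random.default_rng(1)
distant_feats = rng.normal(size=(256, 13))
close_bottleneck = np.tanh(distant_feats @ rng.normal(0.0, 0.5, (13, 8)))

params, losses = train_feature_mapping(distant_feats, close_bottleneck)
mapped = map_features(params, distant_feats)
```

In a setup like the one the abstract describes, the mapped features would then replace the distant-talking bottleneck features as input to the recognizer; that downstream step is outside the scope of this sketch.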
