Artificial stereo data generation for speech feature mapping

Published: 2012, Last Modified: 09 Jan 2026, ICASSP 2012, CC BY-SA 4.0
Abstract: Feature mapping is widely used to eliminate the mismatch between the training and test conditions in speech recognition. In feature mapping, a target (mismatched) feature vector sequence is mapped closer to the corresponding reference (matched) feature vector sequence. The mapping system is usually trained on a set of stereo data, which consists of simultaneous recordings obtained in both the reference and target conditions. In this paper, we propose a novel approach to blind parameter estimation that does not require the reference feature vectors. The proposed approach is motivated by the hidden Markov model (HMM)-based speech synthesis algorithm.
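To make the conventional stereo-data setting concrete, the following is a minimal sketch of training a feature mapping from paired recordings. It is not the blind estimation approach proposed in the paper: it assumes a simple per-dimension affine mapping fitted by least squares, and the function names (`fit_diagonal_affine`, `apply_mapping`) and the synthetic data are hypothetical, introduced only for illustration.

```python
import random


def fit_diagonal_affine(target, reference):
    """Fit a per-dimension scale and offset so that
    scale[d] * target[t][d] + offset[d] approximates reference[t][d]
    in the least-squares sense over all paired (stereo) frames t."""
    n = len(target)
    dim = len(target[0])
    scales, offsets = [], []
    for d in range(dim):
        xs = [frame[d] for frame in target]
        ys = [frame[d] for frame in reference]
        mean_x = sum(xs) / n
        mean_y = sum(ys) / n
        var_x = sum((x - mean_x) ** 2 for x in xs)
        cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        # Fall back to an identity scale if the dimension is constant.
        a = cov_xy / var_x if var_x > 0 else 1.0
        scales.append(a)
        offsets.append(mean_y - a * mean_x)
    return scales, offsets


def apply_mapping(frames, scales, offsets):
    """Map each target frame toward the reference condition."""
    return [[a * x + b for x, a, b in zip(frame, scales, offsets)]
            for frame in frames]


# Synthetic stereo data (an assumption for this sketch): the target
# condition is a noise-free per-dimension distortion of the reference.
rng = random.Random(0)
reference = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]
true_scales = [0.5, 2.0, 1.5]
true_offsets = [1.0, -0.5, 0.2]
target = [[s * r + o for r, s, o in zip(frame, true_scales, true_offsets)]
          for frame in reference]

scales, offsets = fit_diagonal_affine(target, reference)
mapped = apply_mapping(target, scales, offsets)
```

The fit above needs the reference frames for every target frame, which is exactly the stereo-data requirement that the blind parameter estimation described in the abstract is meant to avoid.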