Abstract: We propose a transfer learning approach for speech enhancement based on deep neural networks (DNNs) that adapts a well-trained model, obtained with high-resource materials of one language, to another target language using a small amount of adaptation data. We investigate the performance degradation that occurs when noisy Mandarin speech is enhanced with DNN models trained only on English speech materials, and vice versa. Treating the hidden layers of the well-trained DNN regression model as a cascade of feature extractors, we hypothesize that the first several layers should be transferable between languages. Our experimental results indicate that even with only about 1 minute of adaptation data from the resource-limited language, we can achieve a considerable performance improvement over the DNN model without cross-language transfer learning.
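The layer-transfer hypothesis can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes one possible reading of the abstract: the first `n_frozen` layers of the source-language model are copied and kept fixed, and only the remaining layers are retrained on the adaptation data. The helper names (`init_layer`, `forward`, `mse`, `adapt`), the toy layer sizes, the learning rate, and the synthetic data are all hypothetical, and the randomly initialized `source` network merely stands in for a well-trained regression DNN.

```python
import copy
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Hypothetical helper: one fully connected layer with small random weights."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return {"W": weights, "b": [0.0] * n_out}


def forward(layers, x):
    """Return the input plus every layer's output; the last layer is linear."""
    acts = [x]
    for i, layer in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, acts[-1])) + bias
             for row, bias in zip(layer["W"], layer["b"])]
        is_last = i == len(layers) - 1
        acts.append(z if is_last else [sigmoid(v) for v in z])
    return acts


def mse(layers, data):
    """Mean squared error of the regression output over (input, target) pairs."""
    total = 0.0
    for x, y in data:
        out = forward(layers, x)[-1]
        total += sum((o - t) ** 2 for o, t in zip(out, y)) / len(y)
    return total / len(data)


def adapt(source_layers, data, n_frozen, lr=0.05, epochs=200):
    """Copy the source model, keep the first n_frozen layers fixed,
    and update the remaining layers with plain SGD on the adaptation data."""
    layers = copy.deepcopy(source_layers)
    for _ in range(epochs):
        for x, y in data:
            acts = forward(layers, x)
            # Error signal at the linear output layer (squared-error loss).
            delta = [o - t for o, t in zip(acts[-1], y)]
            for i in range(len(layers) - 1, n_frozen - 1, -1):
                layer, inp = layers[i], acts[i]
                back = None
                if i > n_frozen:
                    # Propagate the error to the sigmoid layer below,
                    # using the weights before they are updated.
                    back = [
                        sum(layer["W"][j][k] * delta[j] for j in range(len(delta)))
                        * inp[k] * (1.0 - inp[k])
                        for k in range(len(inp))
                    ]
                for j, d in enumerate(delta):
                    for k, a in enumerate(inp):
                        layer["W"][j][k] -= lr * d * a
                    layer["b"][j] -= lr * d
                if back is not None:
                    delta = back
    return layers


# Toy setup (all values are illustrative assumptions, not from the paper).
rng = random.Random(0)
source = [init_layer(4, 5, rng), init_layer(5, 5, rng), init_layer(5, 2, rng)]

# A small synthetic "adaptation set" standing in for target-language data.
data = []
for _ in range(8):
    x = [rng.uniform(-1, 1) for _ in range(4)]
    data.append((x, [sum(x) / 4, x[0] - x[1]]))

before = mse(source, data)
adapted = adapt(source, data, n_frozen=2)
after = mse(adapted, data)
print("MSE before adaptation:", before)
print("MSE after adaptation: ", after)
```

In this sketch `n_frozen` controls how many lower layers are treated as language-independent feature extractors. Setting it to 0 retrains every layer, which is the other end of the same trade-off when very little adaptation data is available.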