A statistical framework for artificial bandwidth extension exploiting speech waveform and phonetic transcription

Published: 2009, Last Modified: 13 May 2025, EUSIPCO 2009, CC BY-SA 4.0
Abstract: In the past, artificial bandwidth extension (ABWE) has primarily been investigated as a means of enhancing transmitted narrowband speech signals at the receiving side. State-of-the-art schemes improve quality over narrowband speech; however, a clear gap to wideband speech is still reported. This is largely due to insufficient ABWE performance on fricatives, particularly /s/. We asked ourselves to what extent speech quality could be improved if the currently spoken phoneme were known. In this paper we present a framework that uses phonetic transcriptions as a priori knowledge in addition to the speech waveform. Possible applications are high-quality offline ABWE of telephone, pilot, or historic speech recordings; memory-efficient narrowband speech synthesis followed by ABWE; and extension of narrowband telephone databases to train wideband acoustic models for automatic speech recognition. For the classical conversational telephony application, we also propose an improved ABWE scheme that uses transcription information only during training.