Sparse time-frequency representation of speech by the Vandermonde transform

Christian Fischer Pedersen, Tom Bäckström

Published: 2014, Last Modified: 01 Oct 2024. INTERSPEECH 2014. CC BY-SA 4.0

Abstract: Efficient speech signal representations are a prerequisite for efficient speech processing algorithms. The Vandermonde transform is a recently introduced time-frequency transform which provides a sparse and uncorrelated speech signal representation. In contrast, the Fourier transform decorrelates the signal only approximately. To achieve complete decorrelation, the Vandermonde transform is signal adaptive, like the Karhunen-Loève transform. Unlike the Karhunen-Loève transform, however, the Vandermonde transform is a time-frequency transform whose transform-domain components correspond to frequency components of the analysis window. In this paper we analyze the performance of sparse speech signal representation by the Vandermonde transform. This is done by applying matching pursuit and comparing with sparse representations based on dictionaries with Fourier, cosine, Gabor and Karhunen-Loève atoms. Our results show that the Karhunen-Loève transform yields the best sparse signal recovery; however, it is not strictly a time-frequency transform. Of the true time-frequency transforms, the Vandermonde transform is the most efficient for sparse speech signal representation.
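
The evaluation described in the abstract relies on matching pursuit over a dictionary of atoms. As a rough illustration only, the sketch below runs plain matching pursuit on a single frame using an orthonormal cosine (DCT-II) dictionary. The function names `dct_dictionary` and `matching_pursuit`, the frame length, and the choice of a cosine dictionary are assumptions made for this example; the abstract does not specify the paper's implementation, and the Vandermonde dictionary itself is not constructed here.

```python
import numpy as np

def dct_dictionary(n):
    """Return an n x n matrix whose columns are orthonormal DCT-II atoms.

    Hypothetical helper for illustration; any dictionary with unit-norm
    columns could be substituted.
    """
    k = np.arange(n)[:, None]              # atom (frequency) index
    t = np.arange(n)[None, :]              # time index within the frame
    atoms = np.cos(np.pi * (t + 0.5) * k / n)
    atoms[0] *= 1.0 / np.sqrt(2.0)         # scale the constant atom
    atoms *= np.sqrt(2.0 / n)              # make every atom unit norm
    return atoms.T                         # atoms as columns

def matching_pursuit(x, D, n_atoms):
    """Greedy matching pursuit, assuming real-valued, unit-norm columns in D.

    Returns the coefficient vector and the final residual.
    """
    residual = np.asarray(x, dtype=float).copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(n_atoms):
        corr = D.T @ residual              # inner product with every atom
        j = int(np.argmax(np.abs(corr)))   # best-matching atom
        coeffs[j] += corr[j]               # accumulate its coefficient
        residual -= corr[j] * D[:, j]      # remove its contribution
    return coeffs, residual

# Synthetic frame built from two atoms (not real speech data).
n = 32
D = dct_dictionary(n)
frame = 2.0 * D[:, 3] - 1.5 * D[:, 10]
coeffs, residual = matching_pursuit(frame, D, n_atoms=2)
```

Because the cosine dictionary in this sketch is orthonormal, two iterations recover the two-atom synthetic frame up to rounding error. For a signal-adaptive or overcomplete dictionary, the same loop applies provided the atoms are normalized, but recovery is generally only approximate and the number of selected atoms becomes the sparsity measure being compared.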