Abstract: Recent advances in speech analysis have shown that voiced speech can be represented very well using quasi-harmonic frequency tracks and local parameter adaptivity to the underlying signal. In this paper, we revisit the quasi-harmonicity approach through the extended adaptive Quasi-Harmonic Model (eaQHM), and we show that applying a continuous f0 estimation method together with an adaptivity scheme can yield high-resolution quasi-harmonic analysis and perceptually indistinguishable resynthesized speech. The method starts from an initial harmonic model that successively converges to quasi-harmonicity. Formal listening tests showed that eaQHM is robust against f0 estimation artefacts and provides higher quality in resynthesized speech compared to a recently developed model, the adaptive Harmonic Model (aHM), and the standard Sinusoidal Model (SM).
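For context, the sketch below gives the frame-level quasi-harmonic model that the QHM/eaQHM family builds on, in the form commonly found in the literature; the symbols a_k, b_k, f0, and the correction term eta_k are the standard ones and are not taken from this abstract, so the paper's exact eaQHM parameterization may differ.

x(t) = \sum_{k=-K}^{K} (a_k + t\, b_k)\, e^{j 2\pi k \hat{f}_0 t}, \quad |t| \le T

Here a_k are complex amplitudes and b_k complex slopes, estimated by least squares on a windowed frame. The fitted parameters yield a frequency-mismatch estimate per track,

\hat{\eta}_k = \frac{1}{2\pi} \, \frac{\mathrm{Re}(a_k)\,\mathrm{Im}(b_k) - \mathrm{Im}(a_k)\,\mathrm{Re}(b_k)}{|a_k|^2},

which refines the initially harmonic frequencies k\hat{f}_0 toward quasi-harmonic tracks; iterating this refinement is what allows an initially harmonic model to converge to quasi-harmonicity, as the abstract describes.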