Speech Analysis Based on Modeling the Effective Voice Source
Abstract: A new system-identification-based method is proposed
for accurate estimation of vocal tract parameters. A problem often encountered
in conventional linear prediction analysis stems from
the harmonic structure of the excitation source of voiced speech. This harmonic
characteristic becomes coupled with the estimated autoregressive (AR)
coefficients, which makes the vocal tract filter difficult to estimate. This
paper models the effective voice source from the residual obtained through
covariance analysis in the first pass; the modeled source is then used as the input to a
second-pass least-squares analysis. A better source-filter separation is thus
achieved. The formant frequencies and corresponding bandwidths obtained
using the proposed method for synthetic vowels are found to be more accurate,
with percent errors lower by a factor of more than three, than those from the conventional
method. Since the source characteristic is taken into account, local variations
due to the positioning of the analysis window are reduced significantly.
The validity of the proposed method is also examined by inspecting the
spectra obtained from natural vowel sounds uttered by a high-pitched female
speaker.
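
The two-pass idea described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the effective voice source is modeled, so `model_source` below is a hypothetical placeholder that keeps only the largest residual samples as pulses. All function names, the model order `p`, and the sampling rate `fs` are likewise assumptions introduced for illustration. The formant and bandwidth conversion uses the standard pole-to-resonance relations.

```python
import numpy as np

def lagged_matrix(s, p, start, stop):
    # Columns hold s[n-1], ..., s[n-p] for n in [start, stop).
    return np.column_stack([s[start - k:stop - k] for k in range(1, p + 1)])

def covariance_lp(s, p):
    """First pass: covariance-method linear prediction.
    Returns a_1..a_p of A(z) = 1 + sum_k a_k z^-k."""
    S = lagged_matrix(s, p, p, len(s))
    a, *_ = np.linalg.lstsq(S, -s[p:], rcond=None)
    return a

def residual(s, a):
    """Inverse-filter s with A(z); the first p samples are left at zero."""
    p = len(a)
    e = np.zeros(len(s))
    e[p:] = s[p:] + lagged_matrix(s, p, p, len(s)) @ a
    return e

def model_source(e, n_pulses):
    """Hypothetical source model (an assumption, not from the paper):
    keep only the n_pulses largest-magnitude residual samples."""
    u = np.zeros(len(e))
    idx = np.argsort(np.abs(e))[-n_pulses:]
    u[idx] = e[idx]
    return u

def arx_least_squares(s, u, p):
    """Second pass: fit s[n] = -sum_k a_k s[n-k] + b u[n] by least squares,
    treating the modeled source u as a known input."""
    X = np.column_stack([lagged_matrix(s, p, p, len(s)), u[p:]])
    theta, *_ = np.linalg.lstsq(X, s[p:], rcond=None)
    return -theta[:p], theta[p]

def formants(a, fs):
    """Convert AR coefficients to formant frequencies and bandwidths (Hz)."""
    z = np.roots(np.concatenate(([1.0], a)))
    z = z[z.imag > 0]                       # one pole per conjugate pair
    freq = np.angle(z) * fs / (2 * np.pi)
    bw = -np.log(np.abs(z)) * fs / np.pi
    order = np.argsort(freq)
    return freq[order], bw[order]

def synthesize(a, u):
    """All-pole synthesis s[n] = u[n] - sum_k a_k s[n-k], for testing only."""
    p = len(a)
    s = np.zeros(len(u))
    for n in range(len(u)):
        past = sum(a[k] * s[n - 1 - k] for k in range(min(p, n)))
        s[n] = u[n] - past
    return s
```

A typical call sequence would be `a1 = covariance_lp(s, p)`, `u = model_source(residual(s, a1), n_pulses)`, `a2, b = arx_least_squares(s, u, p)`, then `formants(a2, fs)`. Because the second pass explains the pitch pulses through the input term `b u[n]`, they no longer have to be absorbed by the AR coefficients, which is the source-filter separation the abstract refers to.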