Formant estimation from speech signal using the magnitude spectrum modified with group delay spectrum
Abstract: The magnitude spectrum is a popular mathematical tool for speech signal analysis. In this
paper, we propose a new technique for improving the performance of the magnitude spectrum by
utilizing the benefits of the group delay (GD) spectrum to estimate the characteristics of a vocal tract
accurately. The traditional magnitude spectrum has difficulty estimating vocal tract
characteristics, particularly for high-pitched speech, owing to its low resolution and high spectral
leakage. Phase-domain analysis shows that the GD spectrum has low spectral leakage and
high resolution owing to its additive property. Thus, the magnitude spectrum modified with its GD spectrum,
referred to as the modified spectrum, is found to significantly improve the estimation of formant
frequency over traditional methods. The accuracy is tested on synthetic vowels for a wide range of
fundamental frequencies up to the high-pitched female speaker range. The validity of the proposed
method is also verified by inspecting the formant contour of an utterance from the Texas Instruments
and Massachusetts Institute of Technology (TIMIT) database and standard F2–F1 plot of natural vowel
speech spoken by male and female speakers. The results are compared with those of two state-of-the-art
methods, and the proposed method performs better than both.
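
As a rough illustration of the two quantities the abstract refers to, the sketch below computes the magnitude spectrum and the GD spectrum of a single frame. It uses the standard identity that the GD can be obtained without phase unwrapping from the transforms of x[n] and n·x[n]. The function name, FFT length, and the small floor `eps` are illustrative assumptions. The abstract does not state how the GD spectrum is used to modify the magnitude spectrum, so that step is deliberately left out.

```python
import numpy as np

def magnitude_and_group_delay(frame, n_fft=1024, eps=1e-8):
    """Return (magnitude spectrum, group delay spectrum) of one frame.

    Hypothetical helper for illustration only; not the paper's method.
    Group delay is in samples, one value per rfft frequency bin.
    """
    x = np.asarray(frame, dtype=float)
    n = np.arange(len(x))
    X = np.fft.rfft(x, n_fft)      # spectrum of x[n]
    Y = np.fft.rfft(n * x, n_fft)  # spectrum of n * x[n]
    power = X.real**2 + X.imag**2
    # tau(w) = (X_R * Y_R + X_I * Y_I) / |X(w)|^2, with a floor (assumed)
    # to avoid division by zero near spectral nulls.
    gd = (X.real * Y.real + X.imag * Y.imag) / np.maximum(power, eps)
    return np.abs(X), gd
```

A quick sanity check: a unit impulse delayed by 5 samples has a flat magnitude spectrum of 1 and a constant group delay of 5 samples at every frequency.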