Abstract: This paper addresses the optimization of model parameters for vocal tract length normalization (VTLN). For maximum likelihood (ML) based normalization techniques, the complexity of the VTL-model is a source of variation in system performance. An optimal complexity for the VTL-model, one that yields the best global word error rate, is proposed. The choice of frequency warping factor also depends on the signal processing step of VTLN. A best set of parameters for the VTLN signal processing stage is proposed, supported by extensive results for an optimal frequency range.