Temporal Correlation Based Speech Feature Processing and its Application to Speaker Recognition

Xiaofei Xie, ChengGong Yu

Published: 2006, Last Modified: 17 Jul 2025, SMC 2006, CC BY-SA 4.0

Abstract: The speech signal is continuous, but the feature vectors extracted from it are separate from one another. If the dynamics and temporal correlation of speech feature vectors can be exploited, the performance of speech recognition and speaker recognition should improve. The segment model was proposed to model the dynamic information between feature vectors explicitly, and it has given good results in speech recognition. Here we discuss exploiting temporal correlation for speaker recognition. We modify the feature extraction procedure based on the segment model, so that the resulting feature vectors carry more correlation information than the original ones. In this paper, several such modifications of feature extraction are compared. The experiments were carried out on three speech databases: the YOHO corpus, a phone database, and the SRMC database.