Finding Speaker Identities with a Conditional Maximum Entropy Model

Chengyuan Ma, Patrick Nguyen, Milind Mahajan

Published: 2007, Last Modified: 12 May 2023, ICASSP (4) 2007, Readers: Everyone

Abstract: In this paper, we address the task of identifying speakers by name in audio content. Identifying speakers by name improves the readability of the transcript and provides additional metadata that can help in finding the audio content of interest. We present a conditional maximum entropy (maxent) framework for this problem which yields superior performance and lends itself well to incorporating different types of information. We take advantage of this property of maxent to explore new features for this task. We show that supplementing standard lexical triggers with information such as speaker gender and the position of speaker name mentions affords large gains in performance. At 95% precision, we increase the recall to 67% from the trigger baseline of 38%.
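
As a rough illustration of how a conditional maxent model can combine lexical triggers with other cues, the sketch below scores each candidate label y for a name mention x with p(y | x) proportional to exp(sum_i w_i f_i(x, y)). It is not the authors' implementation: the label set, the feature names, the example mention, and the hand-set weights are all hypothetical, chosen only to show how heterogeneous features enter the same model. In practice the weights would be estimated from labeled data.

```python
import math

# Hypothetical label set: which speaker turn a name mention refers to.
LABELS = ["previous", "current", "next", "none"]

def features(x, y):
    """Binary features f_i(x, y); all feature names are illustrative assumptions."""
    feats = {
        f"trigger={x['trigger']}&label={y}": 1.0,    # lexical trigger near the name
        f"position={x['position']}&label={y}": 1.0,  # where the mention occurs
    }
    # Gender cue: does the name's gender agree with the candidate turn's speaker?
    turn_gender = x["turn_gender"].get(y)
    if turn_gender is not None:
        match = turn_gender == x["name_gender"]
        feats[f"gender_match={match}"] = 1.0
    return feats

def maxent_probs(weights, feature_fn, x, labels):
    """Return p(y | x) = exp(score(x, y)) / Z(x) for every label y."""
    scores = {
        y: sum(weights.get(name, 0.0) * value
               for name, value in feature_fn(x, y).items())
        for y in labels
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}

# Hand-set weights for illustration only; a real model would learn these.
WEIGHTS = {
    "trigger=i'm&label=current": 2.0,
    "gender_match=True": 1.0,
    "gender_match=False": -1.0,
}

# A made-up name mention: "I'm <name>" at the start of a turn.
example = {
    "trigger": "i'm",
    "position": "turn_start",
    "name_gender": "f",
    "turn_gender": {"previous": "m", "current": "f", "next": "m"},
}

probs = maxent_probs(WEIGHTS, features, example, LABELS)
best = max(probs, key=probs.get)
print(best, probs)
```

Because every cue is just another feature function, adding a new information source in this sketch means adding entries to `features` and corresponding weights, without changing the probability computation.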
