Abstract: This paper describes two new approaches to spoken language recognition, both of which were successfully applied in the NIST 2005 Language Recognition Evaluation (LRE-2005). The first approach extends the Gaussian mixture model (GMM) technique with channel dependency. It achieves an actual detection cost (C_DET) of 0.095 in NIST LRE-2005, compared with 0.120 for GMM language models using the traditional two-gender dependency. The second approach is a multi-class logistic regression system, which operates similarly to a support vector machine (SVM) but can be trained for all languages simultaneously. This approach resulted in a C_DET of 0.198. The joint TNO-Spescom Datavoice (TNO-SDV) submission to NIST LRE-2005 contained two more systems and obtained a C_DET of 0.0958.