Abstract: An approach to language identification (LID) based on language-dependent phone recognition is presented. A variety of features and their combinations, extracted by language-dependent recognizers, were evaluated on the same database. Two novel information sources for LID were introduced: (1) forward and backward bigram-based language models, and (2) context-dependent duration models. An LID system using hidden Markov models and a neural network was developed. The system was trained and evaluated using the OGI_TS database. For a six-language task, the system performance (correct rate) reached 91.96% for 45-second utterances and 81.08% for 10-second utterances. The experiments demonstrated the importance of detailed modeling and of the method by which these information sources are combined.
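As a rough illustration of the first information source, the sketch below scores a decoded phone sequence with a forward bigram model (each phone predicted from its predecessor) and a backward bigram model (each phone predicted from its successor). It is not the system described in the abstract. The function names, the toy phone strings, the add-one smoothing, and the definition of "backward" as a bigram over the reversed sequence are all assumptions made for the example. The two scores are simply summed here, whereas the abstract describes a neural network for combining information sources.

```python
import math
from collections import Counter


def train_bigram(sequences, vocab, reverse=False):
    """Estimate add-one-smoothed bigram log-probabilities from phone sequences.

    With reverse=True each sequence is reversed first, so the model predicts
    a phone from the phone that follows it (assumed "backward" bigram).
    """
    pair_counts, context_counts = Counter(), Counter()
    for seq in sequences:
        seq = seq[::-1] if reverse else seq
        for prev, cur in zip(seq, seq[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    v = len(vocab)
    return {
        (a, b): math.log((pair_counts[(a, b)] + 1) / (context_counts[a] + v))
        for a in vocab
        for b in vocab
    }


def score(seq, model, reverse=False):
    """Sum the bigram log-probabilities of a phone sequence under one model."""
    seq = seq[::-1] if reverse else seq
    return sum(model[(prev, cur)] for prev, cur in zip(seq, seq[1:]))


def identify(seq, models):
    """Return the language whose forward + backward score is highest.

    Summing the two scores is a simplification chosen for this sketch.
    """
    def total(lang):
        forward, backward = models[lang]
        return score(seq, forward) + score(seq, backward, reverse=True)

    return max(models, key=total)


# Hypothetical toy data: two made-up "languages" over a three-phone inventory.
vocab = ["a", "b", "k"]
training = {
    "lang_x": [list("abab"), list("abkab")],
    "lang_y": [list("kaka"), list("bkak")],
}
models = {
    lang: (train_bigram(seqs, vocab), train_bigram(seqs, vocab, reverse=True))
    for lang, seqs in training.items()
}

print(identify(list("abab"), models))
print(identify(list("kaka"), models))
```

In a setup like the one the abstract describes, each language-dependent recognizer would produce its own phone sequence, and the language-model scores would be one feature among several (alongside, for example, duration-model scores) passed to a trained combiner rather than added directly.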