Abstract: This paper presents a combined two-component system for language identification based on phonotactic and acoustic features. The phonotactic part consisting of a multilingual phone-recognizer with a double bigram-decoding architecture and a phonetic-context mapping is supported by a second part with pronunciation modeling of the recognized phone-sequence using Gaussian density models. Both parts are post-processed by a neural-based final classifier. Measured on the NIST'95 evaluation set, the described system outperforms state-of-the-art components and, at the same time, requires considerably less computational expense, as compared to implicit phonotactic-acoustic modeling and parallel recognizer architectures.
0 Replies
Loading