Abstract: This paper investigates the feasibility of designing a speech-recognition-based telephony server for in-car applications with an acceptable recognition rate. The whole acoustic channel (sound pickup, sound transmission over the cellular network, and feature extraction) is evaluated, and the loss or gain in performance due to each element is quantified. More precisely, two sound pickup systems (a hypercardioid microphone and a microphone array) were tested, and two front-ends (a standard MFCC front-end and the Aurora advanced front-end) were evaluated. Recognition performance was measured before and after transmission over a cellular (GSM) network. The results demonstrate the gain obtained by using either a robust sound recording device or a noise-robust front-end.