Speech recognition in the blind condition based on multiple directivity patterns using a microphone array

Toshiyuki Sekiya, Tetsunori Kobayashi

Published: 01 Jan 2005, Last Modified: 05 Jun 2024, Venue: ICASSP (1) 2005, License: CC BY-SA 4.0

Abstract: The proposed system is a cascade of a sound localization system, MUSIC, and a sound segregation system, SMDP (segregation using multiple directivity patterns), which we proposed in a previous paper. SMDP is characterized by its use of redundant directivity patterns. Cascade systems of this sort usually struggle to achieve high performance, because the sound localization stage cannot be perfect and errors in this first stage seriously damage the segregation stage. Missing a sound source is particularly critical. In the proposed method, by contrast, errors in the localization stage hardly cause problems as long as they are insertions: SMDP uses redundant directivity patterns from the beginning, so it tolerates insertion errors. Excess sound sources are dealt with by arranging virtual sound sources. The proposed method achieved 70% word accuracy in a double-talk recognition experiment with a 20K-word vocabulary, which is 18% better than ICA-based blind source separation under the source-number-given condition.
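To make the notion of a directivity pattern concrete, the following is a minimal sketch of a two-microphone delay-and-subtract beamformer, which places a null toward one chosen direction. It is not the authors' SMDP; the abstract does not specify how SMDP forms or combines its patterns. The function names, the 5 cm microphone spacing, the 1 kHz analysis frequency, and the speed of sound are all assumptions chosen for illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s (assumed value for air at room temperature)


def null_pattern_gain(theta_deg, null_deg, freq_hz=1000.0, spacing_m=0.05):
    """Gain of a two-microphone delay-and-subtract pattern (hypothetical helper).

    theta_deg: arrival direction of a plane wave (0 degrees = broadside).
    null_deg:  direction toward which the pattern places its null.
    """
    # Inter-microphone delay for the arriving wave and for the null direction.
    tau = spacing_m * math.sin(math.radians(theta_deg)) / SPEED_OF_SOUND
    tau_null = spacing_m * math.sin(math.radians(null_deg)) / SPEED_OF_SOUND
    # Subtract the second channel after compensating for the null-direction delay.
    phase = 2.0 * math.pi * freq_hz * (tau - tau_null)
    return abs(1.0 - cmath.exp(-1j * phase))


def pattern_bank(null_dirs_deg, source_deg):
    """Gain that each pattern in a bank applies to a source at source_deg."""
    return {n: null_pattern_gain(source_deg, n) for n in null_dirs_deg}


if __name__ == "__main__":
    # Assumed scenario: real talkers at -30 and 20 degrees; 60 degrees is a
    # spurious (inserted) direction reported by the localizer.
    estimated_dirs = [-30.0, 20.0, 60.0]
    for src in (-30.0, 20.0):
        gains = pattern_bank(estimated_dirs, src)
        print(src, {n: round(g, 3) for n, g in gains.items()})
```

In this toy setting, the pattern whose null points at a real talker cancels that talker, while the pattern whose null points at the inserted direction cancels neither talker. A talker that the localizer misses, in contrast, has no pattern with a null toward it.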