Deep Kronecker Product Beamforming for Large-Scale Microphone Arrays

Published: 01 Jan 2024, Last Modified: 08 Apr 2025. Venue: IEEE ACM Trans. Audio Speech Lang. Process. 2024. License: CC BY-SA 4.0
Abstract: Although deep-learning-based beamformers have achieved promising performance with small microphone arrays, their performance degrades in very challenging environments, such as those with an extremely low signal-to-noise ratio (SNR), e.g., SNR $\le -10$ dB. A large-scale microphone array with dozens or hundreds of microphones can improve beamformer performance in these scenarios because of its high spatial resolution. However, a dramatic increase in the number of microphones leads to feature redundancy, which complicates feature extraction and network training. To improve the performance of deep beamformers for speech extraction in very challenging scenarios, this paper proposes a novel all-neural Kronecker product beamformer, denoted ANKP-BF, for large-scale microphone arrays. The design takes two aspects into account. First, a larger microphone array provides better spatial filtering than a small one, and deep neural networks are introduced for their powerful non-linear modeling capability in the speech extraction task. Second, the feature redundancy problem is addressed by using the Kronecker product rule to decompose the original high-dimensional weight vector into the Kronecker product of two much lower-dimensional weight vectors. The proposed ANKP-BF is designed to operate in an end-to-end manner. Extensive experiments are conducted on simulated large-scale microphone-array signals generated from the DNS-Challenge and WSJ0-SI84 corpora, and real recordings made in a semi-anechoic room and in outdoor scenes are also used to evaluate and compare the different methods. Quantitative results demonstrate that the proposed method outperforms existing advanced baselines in terms of multiple objective metrics, especially in very low SNR environments.
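The Kronecker product rule mentioned in the abstract can be illustrated with a small numerical sketch. This is not the ANKP-BF network itself; it only shows, under assumptions, how a full beamforming weight vector of length M1*M2 can be replaced by two shorter vectors of lengths M1 and M2. The array sizes, the random weights, the row-major microphone ordering, and the helper name `kron_beamform` are all hypothetical choices made for this illustration.

```python
import numpy as np


def kron_beamform(w1, w2, x):
    """Apply w = kron(w1, w2) to x without forming the full weight vector.

    Assumes (hypothetically) that the M1*M2 microphone signals in x are
    ordered row-major, so x reshapes to an (M1, M2) matrix X and
    kron(w1, w2)^H x == w1^H X conj(w2).
    """
    X = x.reshape(w1.size, w2.size)
    return w1.conj() @ X @ w2.conj()


rng = np.random.default_rng(0)
M1, M2 = 8, 16  # illustrative sub-array sizes; M1 * M2 = 128 microphones

# Random complex weights and one frequency-bin snapshot (illustrative data).
w1 = rng.standard_normal(M1) + 1j * rng.standard_normal(M1)
w2 = rng.standard_normal(M2) + 1j * rng.standard_normal(M2)
x = rng.standard_normal(M1 * M2) + 1j * rng.standard_normal(M1 * M2)

# Reference: build the full 128-dimensional weight vector explicitly.
w_full = np.kron(w1, w2)
y_full = np.vdot(w_full, x)  # vdot conjugates its first argument: w^H x

# Factored form: only M1 + M2 = 24 weights are needed.
y_fact = kron_beamform(w1, w2, x)

print("full weights:", w_full.size, "factored weights:", w1.size + w2.size)
print("outputs match:", np.allclose(y_full, y_fact))
```

In this sketch the number of free weights drops from M1*M2 to M1+M2, which is the dimensionality reduction the abstract refers to; how the paper's networks estimate the two factors is not specified here.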