On Spatial Features for Supervised Speech Separation and its Application to Beamforming and Robust ASR

Zhong-Qiu Wang, DeLiang Wang

Published: 2018, Last Modified: 12 May 2023, ICASSP 2018, Readers: Everyone

Abstract: This study integrates complementary spectral and spatial information to elevate deep-learning-based time-frequency masking and acoustic beamforming. Coherence and directional features are designed as additional input features for deep neural network training to remove diffuse noise and other directional interferences pervasive in real-world recordings. The diffuse and directional features are designed to be relatively invariant to the underlying target direction, number of microphones, and microphone geometry. The estimated masks are then utilized to compute steering vectors and spatial covariance matrices for beamforming and robust ASR. Experiments on the CHiME-4 dataset demonstrate the effectiveness of the proposed approach.
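
The abstract does not state how the estimated masks are turned into steering vectors and spatial covariance matrices. The following is a minimal sketch of one common mask-based MVDR formulation, not necessarily the authors' exact method. It assumes a multichannel STFT `Y` of shape (frequencies, frames, microphones) and a speech mask in [0, 1]; the function names, the principal-eigenvector steering estimate, and the diagonal loading constant are all illustrative assumptions.

```python
import numpy as np

def masked_covariance(Y, mask):
    """Mask-weighted spatial covariance per frequency.

    Y: complex STFT, shape (F, T, M). mask: real weights in [0, 1], shape (F, T).
    Returns an array of shape (F, M, M).
    """
    # Weighted sum over time of the outer products y y^H.
    num = np.einsum("ft,ftm,ftn->fmn", mask, Y, Y.conj())
    den = np.maximum(mask.sum(axis=1), 1e-10)
    return num / den[:, None, None]

def mvdr_weights(phi_s, phi_n, diag_load=1e-6):
    """MVDR weights w = inv(phi_n) d / (d^H inv(phi_n) d), shape (F, M).

    Assumption: the steering vector d is taken as the principal
    eigenvector of the speech covariance phi_s.
    """
    M = phi_s.shape[-1]
    _, vecs = np.linalg.eigh(phi_s)      # eigenvalues in ascending order
    d = vecs[:, :, -1]                   # (F, M)
    # Diagonal loading (an assumed regularizer) keeps phi_n invertible.
    trace = np.trace(phi_n, axis1=1, axis2=2).real
    load = diag_load * trace / M + 1e-10
    phi_n = phi_n + load[:, None, None] * np.eye(M)
    num = np.linalg.solve(phi_n, d[:, :, None])[:, :, 0]   # inv(phi_n) d
    den = np.einsum("fm,fm->f", d.conj(), num)             # d^H inv(phi_n) d
    return num / den[:, None]

def beamform(Y, w):
    """Apply w^H y at every time-frequency unit; returns shape (F, T)."""
    return np.einsum("fm,ftm->ft", w.conj(), Y)
```

In this sketch the noise covariance would typically be computed with the complementary mask, for example `masked_covariance(Y, 1 - mask)`, and the beamformed output would then be passed to feature extraction for the ASR back end.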
