Opine: Leveraging an Optimization-Inspired Deep Unfolding Method for Multi-Channel Speech Enhancement

Andong Li, Rilin Chen, Yu Gu, Chao Weng, Dan Su

Published: 01 Jan 2024, Last Modified: 08 Apr 2025. Venue: ICASSP 2024. License: CC BY-SA 4.0

Abstract: Proximal gradient theory has demonstrated its superiority in the compressive sensing field for complex signal recovery. As an early trial in the speech front-end field, we propose OPINE, an optimization-inspired deep unfolding framework that simulates the traditional iterative optimization process for multi-channel speech enhancement. Specifically, we formulate the joint optimization of beamforming weights and target speech using the Bayesian maximum a posteriori (MAP) criterion. By splitting the problem and introducing the proximal gradient descent method, the original problem can be reformulated as the alternating solution of two sub-problems. Furthermore, we propose to generalize the proximal function into neural network (NN)-based modules, enabling end-to-end learning from massive training data. The experiments are conducted on the spatialized LibriSpeech dataset, and quantitative results show that the proposed method achieves performance comparable to existing advanced baselines.
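
For readers unfamiliar with the idea, the following is a minimal, generic sketch of an unfolded proximal gradient loop on a small real-valued least-squares problem. It is not the OPINE model: the abstract does not give the actual objective, the two sub-problems, or the network design, so everything here (the names `unfolded_proximal_gradient`, `soft_threshold`, `matvec`, `rmatvec`, the quadratic data term, and the L1 prior) is an illustrative assumption. The `prox` argument marks the place where a deep unfolding method would plug in a learned NN-based module instead of a hand-derived proximal operator.

```python
from typing import Callable, List

Vector = List[float]
Matrix = List[List[float]]


def matvec(A: Matrix, x: Vector) -> Vector:
    """Return A @ x for a dense row-major matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A: Matrix, r: Vector) -> Vector:
    """Return A^T @ r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]


def soft_threshold(v: Vector, tau: float) -> Vector:
    """Proximal operator of tau * ||.||_1 (assumed prior, for illustration only)."""
    return [max(abs(vi) - tau, 0.0) * (1.0 if vi >= 0 else -1.0) for vi in v]


def unfolded_proximal_gradient(
    A: Matrix,
    y: Vector,
    prox: Callable[[Vector], Vector],
    step: float,
    n_stages: int,
) -> Vector:
    """Run a fixed number of proximal gradient stages for 0.5 * ||A x - y||^2 + prior(x).

    Each stage is one "unfolded" iteration: a gradient step on the data term
    followed by a proximal step. In a learned variant, `prox` (and possibly
    `step`) would be trainable per stage; here they are fixed.
    """
    x = [0.0] * len(A[0])
    for _ in range(n_stages):
        residual = [p - t for p, t in zip(matvec(A, x), y)]  # A x - y
        grad = rmatvec(A, residual)                          # A^T (A x - y)
        z = [xi - step * gi for xi, gi in zip(x, grad)]      # gradient step
        x = prox(z)                                          # proximal step
    return x


# Hypothetical toy usage: identity mixing matrix and an L1 prior with weight 1.0.
A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, -0.5, 1.5]
x_hat = unfolded_proximal_gradient(
    A, y, prox=lambda z: soft_threshold(z, 1.0), step=1.0, n_stages=5
)
```

Passing the proximal step as a callable keeps the loop structure fixed while the prior is swapped out, which is the basic design idea behind replacing a hand-crafted proximal function with a trainable module.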