RENet: A Time-Frequency Domain General Speech Restoration Network for ICASSP 2024 Speech Signal Improvement Challenge
Abstract: The ICASSP 2024 Speech Signal Improvement (SSI) Challenge seeks to address speech quality degradation in telecommunication systems. In this context, this paper proposes RENet, a time-frequency (T-F) domain method that uses complex spectrum mapping to mitigate speech distortions. RENet is a multi-stage network. First, TF-GridGAN recovers the degraded speech using a generative adversarial network (GAN). Second, a full-band enhancement module removes the residual noise and artifacts remaining in the output of TF-GridGAN. Finally, a lightweight bandwidth extension (BWE) network further improves speech quality by generating high-resolution speech. Subjective results confirm the competitive performance of the proposed method under various distortions, and the method ranked 2nd in the non-real-time track of the ICASSP 2024 SSI Challenge.
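To make the multi-stage, complex-spectrum-mapping idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each stage maps the real and imaginary parts of an STFT to a new real/imaginary pair of the same shape. The names `stft`, `istft`, and `run_pipeline`, the FFT size, and the hop size are all hypothetical choices made for illustration, and the three stages (restoration, full-band enhancement, bandwidth extension) are represented by placeholder callables rather than trained networks. A real bandwidth extension stage would also change the frequency resolution, which this sketch omits.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Complex spectrum of x, shape (frames, bins). Assumed analysis settings."""
    win = np.hanning(n_fft + 1)[:-1]          # periodic Hann window
    x = np.pad(x, (n_fft, n_fft))             # pad so every sample is fully overlapped
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * win for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(spec, length, n_fft=512, hop=128):
    """Weighted overlap-add inverse of stft(), trimmed to `length` samples."""
    win = np.hanning(n_fft + 1)[:-1]
    frames = np.fft.irfft(spec, n=n_fft, axis=1) * win
    out = np.zeros(n_fft + hop * (spec.shape[0] - 1))
    norm = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += win ** 2
    out /= np.maximum(norm, 1e-8)             # undo the squared-window weighting
    return out[n_fft:n_fft + length]

def run_pipeline(x, stages):
    """Apply each stage in order to the stacked (real, imag) spectrum.

    Each stage is a hypothetical callable taking and returning an array of
    shape (2, frames, bins); in a real system it would be a trained network.
    """
    spec = stft(x)
    ri = np.stack([spec.real, spec.imag], axis=0)
    for stage in stages:
        ri = stage(ri)                        # complex spectrum mapping: (re, im) -> (re, im)
    return istft(ri[0] + 1j * ri[1], length=len(x))

# Placeholder stages standing in for restoration, enhancement, and BWE.
identity = lambda ri: ri
rng = np.random.default_rng(0)
signal = rng.standard_normal(4000)
restored = run_pipeline(signal, [identity, identity, identity])
```

With identity stages the pipeline reduces to an STFT followed by its inverse, so `restored` matches `signal` up to floating-point error. Replacing a placeholder with a learned model changes only the callable, which is the appeal of cascading stages on a shared T-F representation.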