Lightweight U-Net Based Monaural Speech Source Separation for Edge Computing Device

Kwang Myung Jeon, Chanjun Chun, Gyeongbong Kim, Chaejun Leem, Bohee Kim, Wooyeol Choi

Published: 01 Jan 2020, Last Modified: 05 Nov 2023, ICCE 2020, Readers: Everyone

Abstract: This paper proposes a lightweight U-Net based monaural speech source separation method that brings high-quality speech source separation to edge computing devices equipped with a monaural microphone. The proposed method uses U-shaped neural networks to separate speech from interfering noise in input mixtures in the time-frequency domain. To shrink the networks enough for real-time operation on a resource-constrained edge computing device, the method applies an inception-like multi-lane dimensionality reduction module to each convolutional layer of the U-Net. Performance is evaluated in terms of separation quality and number of parameters. Compared with a conventional U-Net based speech separation model, the proposed lightweight method achieves separation performance nearly on par with the conventional model while using a model footprint of 1.39 MB, which is only 3.72% of the size of the conventional U-Net. Moreover, the proposed method was successfully implemented on an off-the-shelf edge computing device with a tensor processing unit.
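The abstract does not give the lane layout of the dimensionality reduction module, so the following is only a rough sketch of why an inception-like multi-lane design can cut parameters. It assumes a GoogLeNet-style layout (a 1x1 lane, 1x1-then-3x3 and 1x1-then-5x5 lanes, and a pooling lane with a 1x1 projection), a 5x5 baseline convolution, 256 input and output channels, and a reduction factor of 4. All of these values and the function names are hypothetical and are not taken from the paper. The last function simply backs out the conventional model size implied by the reported 1.39 MB and 3.72% figures.

```python
def conv_params(c_in, c_out, k, bias=True):
    """Parameter count of a k x k 2-D convolution (weights plus optional bias)."""
    return c_in * c_out * k * k + (c_out if bias else 0)


def inception_like_params(c_in, c_out, reduction=4):
    """Parameter count of a hypothetical four-lane inception-like module.

    Assumed layout (not from the paper): each lane produces c_out / 4
    channels, and the lane outputs are concatenated along the channel axis.
    The 3x3 and 5x5 lanes first reduce the channels with a 1x1 convolution.
    """
    assert c_out % 4 == 0, "c_out must split evenly across the four lanes"
    lane_out = c_out // 4
    mid = max(1, c_in // reduction)  # bottleneck width after the 1x1 reduction

    lane_1x1 = conv_params(c_in, lane_out, 1)
    lane_3x3 = conv_params(c_in, mid, 1) + conv_params(mid, lane_out, 3)
    lane_5x5 = conv_params(c_in, mid, 1) + conv_params(mid, lane_out, 5)
    lane_pool = conv_params(c_in, lane_out, 1)  # pooling itself has no weights
    return lane_1x1 + lane_3x3 + lane_5x5 + lane_pool


def implied_conventional_mb(light_mb, percent_of_conventional):
    """Size of the conventional model implied by the two reported figures."""
    return light_mb / (percent_of_conventional / 100.0)


# Assumed example layer: 256 -> 256 channels, 5x5 baseline kernel.
baseline = conv_params(256, 256, 5)
compact = inception_like_params(256, 256)

print(f"baseline 5x5 conv parameters : {baseline}")
print(f"multi-lane module parameters : {compact}")
print(f"ratio for this single layer  : {compact / baseline:.3f}")
print(f"implied conventional size    : {implied_conventional_mb(1.39, 3.72):.2f} MB")
```

The per-layer ratio printed here depends entirely on the assumed channel counts and kernels, so it should not be read as reproducing the 3.72% figure. The implied conventional size, by contrast, follows directly from the two numbers reported in the abstract.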
