A 0.75mm² 407μW Real-Time Speech Audio Denoiser with Quantized Cascaded Redundant Convolutional Encoder-Decoder for Wearable IoT Devices
Abstract: Audio denoising is crucial for delivering high-quality sound in applications ranging from communication devices to entertainment systems. On-device denoising is critical for ensuring consistent performance across various host platforms. Machine learning (ML) models exhibit strong audio processing performance in the frequency domain but require efficient hardware design. This paper focuses on enhancing audio quality using convolutional encoder-decoder ML models with low power consumption while meeting real-time processing constraints. We achieve this by developing a quantized network that reduces computational cost without compromising enhancement quality. Furthermore, our hardware quantization scheme reduces memory usage by up to 75% while maintaining accuracy. Next, we design a complementary processing element activation routing scheme tailored to our algorithm, reducing on-chip memory accesses by 5-9×. Fabricated in a 28nm CMOS process, our chip demonstrates real-time audio denoising, processing each frame within 8ms while consuming only 407 μW, or 3.24 μJ/frame, at 0.65V and 18.5 MHz, making it well suited to battery-powered IoT devices. Our chip also achieves the highest audio quality evaluation score (PESQ) among the previous works compared.
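The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 32-bit to 8-bit word widths used for the memory calculation are an assumption chosen because they yield a 75% saving; the abstract does not state the actual bit widths.

```python
# Figures quoted in the abstract.
POWER_W = 407e-6             # average power at 0.65 V
FRAME_BUDGET_S = 8e-3        # real-time deadline per frame
CLOCK_HZ = 18.5e6            # operating frequency
ENERGY_PER_FRAME_J = 3.24e-6 # reported energy per frame


def energy_upper_bound_j(power_w: float, frame_s: float) -> float:
    """Energy used if the chip ran for the whole frame budget."""
    return power_w * frame_s


def implied_frame_time_s(energy_j: float, power_w: float) -> float:
    """Processing time implied by the reported energy and power."""
    return energy_j / power_w


def cycle_budget(clock_hz: float, frame_s: float) -> int:
    """Clock cycles available within one frame deadline."""
    return round(clock_hz * frame_s)


def memory_saving(bits_before: int, bits_after: int) -> float:
    """Fraction of storage saved by narrowing the word width."""
    return 1 - bits_after / bits_before


if __name__ == "__main__":
    bound = energy_upper_bound_j(POWER_W, FRAME_BUDGET_S)
    time_s = implied_frame_time_s(ENERGY_PER_FRAME_J, POWER_W)
    print(f"energy bound per frame: {bound * 1e6:.3f} uJ")
    print(f"implied frame time:     {time_s * 1e3:.2f} ms")
    print(f"cycles per frame:       {cycle_budget(CLOCK_HZ, FRAME_BUDGET_S)}")
    # Assumed widths (not from the paper): 32-bit values stored as 8-bit.
    print(f"memory saving 32->8:    {memory_saving(32, 8):.0%}")
```

Under these numbers, 407 μW sustained over a full 8 ms frame bounds the energy at about 3.26 μJ, so the reported 3.24 μJ/frame corresponds to a processing time just under the 8 ms deadline, with a budget of 148,000 clock cycles per frame at 18.5 MHz.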
External IDs:dblp:conf/vlsid/KocharAC25