Improving latency performance trade-off in keyword spotting applications at the edge

Published: 01 Jan 2023, Last Modified: 04 Nov 2024, Venue: IWASI 2023, Readers: Everyone, License: CC BY-SA 4.0
Abstract: Keyword Spotting (KWS) is useful in many innovative ambient intelligence applications, such as smart cities and home automation. While solving KWS on GP/GPUs has become a trivial task in recent years, many benefits arise when KWS applications run at the edge (e.g., privacy by design and infrastructure sustainability), where resources are limited. Hardware-aware scaling (HAS) is a novel paradigm that brings neural architectures to low-resource platforms. With HAS, it is possible to optimize neural architectures to fit on embedded platforms (e.g., microcontrollers) while achieving the best performance-complexity and performance-latency trade-offs. This paper shows how HAS, coupled with a neural network with appropriate scaling capabilities, can outperform architectures designed with neural architecture search techniques, such as MCUNet. Our method achieves 94.5% accuracy when classifying the 35 keywords in Google Speech Commands v2, with only 70 ms of latency and an overall energy consumption of less than 10 mJ.