- Keywords: speech processing, keyword spotting, on-device inference, online inference, keyword spotting hardware, edge AI, low power, deep learning accelerator, TinyML, hardware-aware training
- TL;DR: New state-of-the-art results on streamable SpeechCommands 2.0 in accuracy and model size, and in power consumption on hardware.
- Abstract: Keyword spotting~(KWS) provides a critical user interface for many mobile and edge applications, including phones, wearables, and cars. As KWS systems are typically `always on', maximizing both accuracy and power efficiency is central to their utility. In this work we use hardware-aware training~(HAT) to build new KWS neural networks based on the Legendre Memory Unit~(LMU) that achieve state-of-the-art~(SotA) accuracy with low parameter counts. This allows the neural network to run efficiently on standard hardware (212\,$\mu$W). We also characterize the power requirements of custom-designed accelerator hardware that achieves SotA power efficiency of 8.79\,$\mu$W, beating general-purpose low-power hardware (a microcontroller) by 24x and special-purpose ASICs by 16x.