Abstract: We investigate low-bit quantization to reduce the computational cost of deep neural network (DNN) based keyword spotting (KWS). We propose approaches to further reduce the number of quantization bits by integrating quantization into keyword spotting model training, which we refer to as quantization-aware training. Our experimental results on a large dataset indicate that quantization-aware training can recover the performance of models quantized to lower-bit representations. By combining quantization-aware training and weight matrix factorization, we significantly reduce model size and computation for small-footprint keyword spotting while maintaining performance.
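The core idea behind quantization-aware training can be illustrated with "fake quantization": during training, weights are rounded to a low-bit uniform grid in the forward pass so the loss reflects quantization error, while gradients flow through unchanged (the straight-through estimator). The paper does not publish its implementation; the sketch below is a generic minimal illustration in NumPy, not the authors' code, and the function name and bit width are illustrative.

```python
import numpy as np

def fake_quantize(w, num_bits=4):
    """Map weights onto a uniform grid of 2**num_bits levels and back to
    floats, simulating low-bit inference inside the training forward pass.
    In QAT the backward pass would treat this op as identity
    (straight-through estimator)."""
    qmax = 2 ** num_bits - 1
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / qmax if w_max > w_min else 1.0
    q = np.round((w - w_min) / scale)   # integer levels in [0, qmax]
    return q * scale + w_min            # dequantize back to float

w = np.array([-0.9, -0.1, 0.0, 0.4, 0.8])
wq = fake_quantize(w, num_bits=2)  # only 2**2 = 4 representable values
```

Rounding error per weight is bounded by half the grid step, which is why recovering accuracy at very low bit widths requires the training loop, rather than post-hoc quantization, to absorb that error.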
TL;DR: We investigate quantization-aware training in very low-bit quantized keyword spotters to reduce the cost of on-device keyword spotting.
Keywords: keyword spotting, quantization-aware training, small-footprint