N Multipliers for N Bits: Learning Bit Multipliers for Non-Uniform Quantization

Raghav Singhal; Anmol Biswas; Sivakumar Elangovan; Shreyas Sabnis; Udayan Ganguly

Published: 17 Oct 2024, Last Modified: 06 Dec 2024, Venue: MLNCP Poster, License: CC BY 4.0

Keywords: Quantization, Energy Efficiency, Spiking Neural Networks

Abstract: Effective resource management is critical for deploying Deep Neural Networks (DNNs) in resource-constrained environments, highlighting the importance of low-bit quantization to optimize memory and speed. In this paper, we introduce N-Multipliers-for-N-Bits, a novel method for non-uniform quantization designed for efficient hardware implementation. Our method uses N parameters, distinct for every layer and corresponding to the N quantization bits, whose linear combinations span the set of allowed weights (and activations). Furthermore, we learn these parameters in conjunction with the weights, ensuring exceptional flexibility in the quantizer model with minimal hardware overhead. We validate our method on CIFAR-10 and ImageNet, achieving competitive results with 3- and 4-bit quantized models. We demonstrate strong performance on 4-bit quantized Spiking Neural Networks (SNNs), evaluated on the CIFAR10-DVS and N-Caltech 101 datasets. Further, we address the issue of stuck-at faults in hardware, and demonstrate robustness to up to 30% faulty bits.
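The idea of N per-layer multipliers whose linear combinations span the allowed values can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the bit alphabet or how values are assigned to levels, so the {0, 1} bit values, the nearest-level projection, and the names `allowed_levels` and `quantize` are all assumptions made for illustration. The joint learning of multipliers and weights is not shown.

```python
import itertools
import numpy as np

def allowed_levels(multipliers, bit_values=(0.0, 1.0)):
    """Enumerate every bit code and the level it produces.

    Each level is sum_i b_i * m_i, where m_i are the N multipliers and
    b_i are bits. The {0, 1} alphabet is an assumption, not from the paper.
    """
    m = np.asarray(multipliers, dtype=float)
    # All 2^N bit patterns, one per row.
    codes = np.array(list(itertools.product(bit_values, repeat=len(m))))
    return codes, codes @ m

def quantize(values, multipliers, bit_values=(0.0, 1.0)):
    """Map each value to its nearest allowed level (assumed projection rule)."""
    codes, levels = allowed_levels(multipliers, bit_values)
    v = np.asarray(values, dtype=float)
    # Index of the closest level for every element of v.
    idx = np.abs(v[..., None] - levels).argmin(axis=-1)
    return levels[idx], codes[idx]

# Power-of-two multipliers recover an ordinary uniform 3-bit grid.
_, uniform = allowed_levels([1.0, 2.0, 4.0])

# Arbitrary (hypothetical) multipliers give a non-uniform grid.
q, bits = quantize([0.28, 0.66], [0.1, 0.25, 0.6])
```

With multipliers `[1, 2, 4]` the levels are the integers 0 through 7, so uniform quantization appears as a special case; freeing the multipliers lets the grid become non-uniform while each value is still stored as N bits.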

Submission Number: 21
