Abstract: Compared to conventional deep neural networks (DNNs), the binary activations (spikes) of spiking neural networks (SNNs) greatly improve energy efficiency, especially on spatio-temporal computer vision tasks. Despite the memory reduction introduced by binary spikes, the full-precision membrane potential of an SNN requires a memory array as large as the full-precision activations of a DNN. In other words, the binary spikes of SNNs improve the efficiency of layer-wise communication but do not necessarily improve total storage or overall memory efficiency. Prior work has not fully investigated low-precision membrane potentials from the perspective of overall SNN memory efficiency and hardware awareness. Motivated by this, this paper proposes IM-SNN, an integer-only SNN with low-precision weights and membrane potential. Unlike prior works that ignore the hardware bottleneck of iterative membrane updates, IM-SNN provides a systematic compression scheme that makes the quantized SNN fully deployable on hardware accelerators. In addition to low-precision weights, IM-SNN compresses the membrane potential down to ternary precision, leading to outstanding hardware compatibility and memory efficiency. The proposed method is validated on both static image datasets (CIFAR, ImageNet) and event-based datasets (DVS-CIFAR10, N-Caltech, Prophesee Automotive Gen1) for both classification and object detection tasks. Compared to state-of-the-art (SoTA) SNN works, our method achieves up to $13\times$ memory reduction with negligible accuracy degradation.
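To make the core idea concrete, the following is a minimal sketch of an integer-only LIF neuron whose membrane potential is kept at ternary precision. This is not the authors' implementation: the threshold, the sign-based ternary quantizer, and the hard reset are illustrative assumptions standing in for whatever scheme the paper actually uses.

```python
import numpy as np

# Minimal sketch: integer-only LIF update with a ternary membrane potential.
# NOT the IM-SNN implementation; the threshold value, the sign-based ternary
# quantizer, and the hard reset rule are all assumptions for illustration.

V_TH = 1  # integer firing threshold (assumed)


def ternary(u):
    """Quantize the membrane potential to {-1, 0, +1} (assumed quantizer)."""
    return np.sign(u).astype(np.int8)


def lif_step(u, w_int, spikes_in):
    """One integer-only LIF timestep.

    u         : ternary membrane potential from the previous step, int8, shape (n_out,)
    w_int     : low-precision integer weights, shape (n_out, n_in)
    spikes_in : binary input spikes, shape (n_in,)
    """
    # Integer synaptic current: integer weights times binary spikes,
    # so the whole update stays in integer arithmetic.
    current = w_int @ spikes_in.astype(np.int32)
    u_new = u.astype(np.int32) + current            # integer accumulation
    spikes_out = (u_new >= V_TH).astype(np.int8)    # fire when threshold is reached
    u_new = np.where(spikes_out == 1, 0, u_new)     # hard reset after firing (assumed)
    return ternary(u_new), spikes_out               # re-quantize potential to ternary


# Toy usage: 4 output neurons, 8 inputs, 3 timesteps.
rng = np.random.default_rng(0)
w = rng.integers(-2, 3, size=(4, 8), dtype=np.int32)  # low-bit integer weights
u = np.zeros(4, dtype=np.int8)                        # ternary membrane potential
for t in range(3):
    s_in = rng.integers(0, 2, size=8, dtype=np.int8)  # random binary input spikes
    u, s_out = lif_step(u, w, s_in)
    print(f"t={t} potential={u} spikes={s_out}")
```

The point of the sketch is the memory claim in the abstract: between timesteps, the only per-neuron state carried over is the ternary potential (2 bits or less per neuron) rather than a full-precision value, which is where the overall memory saving comes from.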