Keywords: spiking neural networks, pruning, quantization, optimization, second-order methods
TL;DR: This paper extends the OBS-style second-order pruning algorithm to spiking neural networks.
Abstract: Spiking Neural Networks (SNNs) have emerged as a new generation of energy-efficient neural networks suitable for implementation on neuromorphic hardware. Because neuromorphic hardware has limited memory and computing resources, weight pruning and quantization have recently been explored to improve SNNs' efficiency. State-of-the-art SNN pruning/quantization methods employ multiple compression and training iterations, which is costly for pre-trained or very large SNNs. In this paper, we propose a novel one-shot post-training compression framework, Spiking Brain Compression (SBC), which extends the classical Optimal Brain Surgeon (OBS) method to SNNs. SBC replaces the current-based objective found in common layer-wise compression methods with a spike train-based objective whose Hessian is cheaply computable, so that a single backward pass suffices to compress synapses and analytically rescale the remaining ones. We apply SBC to both SNN pruning and quantization. Our experiments on models trained on neuromorphic datasets (N-MNIST, CIFAR10-DVS, DVS128-Gesture) and large static datasets (CIFAR-100, ImageNet) show state-of-the-art results among one-shot post-training compression methods for SNNs, with single- to double-digit accuracy gains over ANN methods applied to SNNs. Combined with fine-tuning, SBC is also competitive in accuracy with costly iterative methods, while cutting compression time by two orders of magnitude.
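To make the second-order mechanism concrete, below is a minimal sketch of the classical OBS layer-wise pruning step that SBC builds on, assuming a squared-error layer objective with Hessian H = 2XXᵀ on layer inputs X. The function `obs_prune_row` and its parameter names are illustrative, not the paper's implementation; in particular, SBC's spike train-based objective is not reproduced here, only the generic saliency-and-rescale update.

```python
import numpy as np

def obs_prune_row(w, H_inv, n_prune):
    """Greedy OBS pruning of one weight row (one output neuron).

    Sketch of classical Optimal Brain Surgeon, not SBC itself.
    w      : (d,) weight vector
    H_inv  : (d, d) inverse Hessian of the layer-wise objective
             (e.g. H = 2 * X @ X.T for squared error on inputs X)
    """
    w, H_inv = w.astype(float).copy(), H_inv.copy()
    pruned = np.zeros(w.shape[0], dtype=bool)
    for _ in range(n_prune):
        # OBS saliency: loss increase incurred by zeroing each weight
        saliency = w ** 2 / np.diag(H_inv)
        saliency[pruned] = np.inf              # never re-prune a weight
        q = int(np.argmin(saliency))
        # Second-order update: analytically rescale the surviving weights
        w -= (w[q] / H_inv[q, q]) * H_inv[:, q]
        w[q] = 0.0
        pruned[q] = True
        # Remove row/column q from the inverse Hessian (one elimination step)
        H_inv -= np.outer(H_inv[:, q], H_inv[q, :]) / H_inv[q, q]
        H_inv[q, q] = 1.0                      # keep diagonal safe to divide by
    return w
```

Because the squared-error Hessian is shared across output neurons, this row update runs independently per neuron over a whole layer, which is what makes one-shot, post-training compression tractable.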
Supplementary Material: zip
Primary Area: optimization
Submission Number: 12576