Efficient Surrogate Gradients for Training Spiking Neural Networks
Keywords: spiking neural network, surrogate gradient, adaptive adjustment, low overhead.
Abstract: Spiking neural networks (SNNs) are widely regarded as one of the next-generation neural network infrastructures, yet they suffer from an inherent non-differentiability that makes the traditional backpropagation (BP) method infeasible. Surrogate gradients (SGs), which approximate the shape of the Dirac $\delta$-function, can alleviate this issue to some extent. To our knowledge, however, most prior work keeps a fixed surrogate gradient for all layers, ignoring the trade-off between how closely the surrogate approximates the delta function and the effective domain of its gradients on a given dataset; this limits the efficiency of surrogate gradients and impairs overall model performance. To guide shape optimization when applying surrogate gradients to train SNNs, we propose an indicator $k$, which represents the proportion of membrane potentials with non-zero gradients in backpropagation. We further present a novel $k$-based training pipeline that adaptively trades off the shape of the surrogate gradient against its effective domain, and we verify it with a series of ablation experiments. Our algorithm achieves 68.93\% accuracy on the ImageNet dataset using SEW-ResNet34. Moreover, our method adds only very low extra cost and can be easily integrated into existing training procedures.
Submission Number: 65
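
The indicator $k$ described in the abstract can be illustrated with a small sketch. The abstract does not specify the surrogate shape, the firing threshold, or how $k$ drives the adaptive adjustment, so everything below is an assumption for illustration: a triangular surrogate with a hypothetical `width` parameter, a threshold `v_th = 1.0`, and synthetic Gaussian membrane potentials. The sketch only shows the trade-off itself: a narrower surrogate is closer to the $\delta$-function but gives non-zero gradients to a smaller proportion of membrane potentials.

```python
import random


def triangular_surrogate(u, v_th=1.0, width=1.0):
    """Triangular stand-in for the Dirac delta, centred on the threshold.

    Assumed shape (not taken from the paper): it integrates to 1 and is
    non-zero only where |u - v_th| < width.
    """
    return max(0.0, 1.0 - abs(u - v_th) / width) / width


def indicator_k(potentials, v_th=1.0, width=1.0):
    """Proportion of membrane potentials that receive a non-zero gradient."""
    nonzero = sum(
        1 for u in potentials if triangular_surrogate(u, v_th, width) > 0.0
    )
    return nonzero / len(potentials)


if __name__ == "__main__":
    # Synthetic membrane potentials; a real run would collect these per layer.
    random.seed(0)
    potentials = [random.gauss(0.0, 1.0) for _ in range(10_000)]

    # Narrow widths approximate the delta function better but lower k.
    for width in (0.25, 0.5, 1.0, 2.0):
        k = indicator_k(potentials, width=width)
        print(f"width={width:4.2f}  k={k:.3f}")
```

In an actual training pipeline the potentials would be gathered from each spiking layer during the forward pass, and the measured $k$ could then inform how the surrogate's shape is adjusted; that adjustment rule is not stated in the abstract and is deliberately left out here.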