Keywords: deep spiking neural networks, surrogate gradients, adaptive, asymmetric
Abstract: Training deep spiking neural networks (SNNs) remains challenging due to sharp loss landscapes and temporal inconsistency caused by surrogate gradients.
To address these challenges, we propose a unified framework: adaptive and asymmetric surrogate gradients (A$^2$SG).
The adaptive gradients adjust the effective window of the surrogate across space and time, reducing spatial gradient variation and maintaining the directional consistency of gradients over time.
The asymmetric gradients reflect neuronal dynamics by assigning larger gradients to neurons with higher membrane potentials, and we prove that they yield lower gradient variation than symmetric surrogates.
Our analysis further establishes a direct connection between local gradient variation and the curvature of the loss landscape, providing a principled explanation for how A$^2$SG promotes convergence to flatter minima and improves generalization.
We conduct extensive experiments on diverse models, including CNN-based and Transformer-based SNNs, across tasks ranging from image classification on both static and neuromorphic datasets to segmentation.
The results demonstrate that A$^2$SG consistently improves accuracy and energy efficiency, establishing it as a general and reliable solution for training deep SNNs.
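To make the idea concrete, below is a minimal PyTorch sketch of an asymmetric, window-limited surrogate gradient of the kind the abstract describes. It assumes a rectangular window of half-width `alpha` and a simple two-slope asymmetry controlled by `asym`; these functional forms and parameter names are illustrative assumptions, not the definitions used by A$^2$SG.

```python
import torch

class A2SGSpike(torch.autograd.Function):
    """Heaviside spike with an asymmetric, window-limited surrogate gradient.

    Hypothetical illustration: the functional form and the parameters
    (threshold, alpha, asym) are assumptions, not the paper's definitions.
    """

    @staticmethod
    def forward(ctx, membrane, threshold, alpha, asym):
        ctx.save_for_backward(membrane)
        ctx.threshold, ctx.alpha, ctx.asym = threshold, alpha, asym
        return (membrane >= threshold).float()  # exact Heaviside spike

    @staticmethod
    def backward(ctx, grad_output):
        (membrane,) = ctx.saved_tensors
        u = membrane - ctx.threshold
        # Adaptive part: gradients flow only inside an effective window
        # of half-width alpha around the threshold; alpha could be
        # scheduled or learned during training.
        inside = (u.abs() < ctx.alpha).float()
        # Asymmetric part: neurons above threshold (u >= 0) receive a
        # larger surrogate gradient than neurons below it.
        scale = torch.where(u >= 0,
                            torch.full_like(u, ctx.asym),
                            torch.ones_like(u))
        surrogate = inside * scale / ctx.alpha
        # Only the membrane potential receives a gradient.
        return grad_output * surrogate, None, None, None

# Example: spikes = A2SGSpike.apply(v, 1.0, 0.5, 2.0), where v is the
# membrane-potential tensor at one timestep.
```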
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 8815