Rethinking the Spatiotemporal Distribution for High-Fidelity Parallel ANN-to-SNN Conversion

ICLR 2026 Conference Submission 16950 Authors

19 Sept 2025 (modified: 08 Oct 2025) · CC BY 4.0
Keywords: spiking neural networks, SNN, ANN-to-SNN Conversion, Parallel Conversion
TL;DR: By correcting the spatial and temporal mismatches in neuronal activity during parallel ANN-to-SNN conversion, our method achieves state-of-the-art accuracy for ultra-low-latency SNNs.
Abstract: Spiking Neural Networks (SNNs) have attracted increasing attention for their low power consumption and constant-time inference on neuromorphic hardware. Among existing approaches, ANN-to-SNN conversion is one of the most effective ways to obtain deep SNNs with accuracy comparable to traditional ANNs, and recent work has even extended it to \emph{parallel} conversion, where the full spike train is emitted in a single pass. Despite this promise, we find that ANN-to-SNN parallel conversion suffers from severe performance degradation at ultra-low timesteps ($T \leq 4$), limiting its practical use. In this work, we analyze the source of this performance gap and demonstrate that it originates from assumptions in the standard quantization–clip–floor–shift (QCFS) formulation, which, under the one-shot firing rule, introduces a step-dependent bias. To overcome this, we propose a \emph{distribution-aware parallel calibration} that corrects spatiotemporal mismatches while leaving the backbone and firing rule unchanged. Our method consists of two stages: (1) \textbf{spatial recalibration}, which adapts normalization layers to spike-domain statistics, and (2) \textbf{temporal correction}, which learns a per-channel, time-collapsed aggregated membrane potential bias to offset timestep-dependent errors. On ImageNet-1k, our approach boosts ResNet-18 top-1 accuracy from $\mathbf{25.20\%\!\to\!62.28\%}$ at $T=4$ and ResNet-34 from $\mathbf{50.67\%\!\to\!68.23\%}$ at $T=8$. These results demonstrate that revisiting—and correcting—standard QCFS premises in the \emph{parallel setting} is essential for accurate, low-latency SNNs without retraining the backbone.
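For readers unfamiliar with the setup, below is a minimal PyTorch sketch of the building blocks named in the abstract: a QCFS-style activation (the standard quantization-clip-floor-shift formulation with shift $1/2$), a batch-norm re-estimation pass standing in for spatial recalibration, and a learnable per-channel bias on the time-collapsed membrane potential standing in for temporal correction. All names here (`qcfs`, `recalibrate_norm_stats`, `TemporalBias`) are illustrative assumptions based only on the abstract, not the authors' implementation.

```python
import torch
import torch.nn as nn

def qcfs(x, threshold, T):
    # Quantization-clip-floor-shift (QCFS) activation with shift 1/2:
    # quantizes pre-activations into T discrete levels in [0, threshold],
    # approximating the average firing rate of a spiking neuron over T steps.
    return threshold / T * torch.clamp(torch.floor(x * T / threshold + 0.5), min=0, max=T)

@torch.no_grad()
def recalibrate_norm_stats(snn, calib_loader, device="cpu"):
    # Stage 1 (spatial recalibration, hypothetical helper): re-estimate
    # BatchNorm running statistics under spike-domain activations by
    # forwarding a small calibration set; weights are left untouched.
    snn.train()  # BN layers update running mean/var only in train mode
    for images, _ in calib_loader:
        snn(images.to(device))
    snn.eval()

class TemporalBias(nn.Module):
    # Stage 2 (temporal correction, hypothetical module): a learnable
    # per-channel bias added to the time-collapsed (summed over T)
    # aggregated membrane potential to offset timestep-dependent error.
    def __init__(self, num_channels):
        super().__init__()
        self.bias = nn.Parameter(torch.zeros(num_channels))

    def forward(self, aggregated_potential):
        # aggregated_potential: (batch, channels, H, W)
        return aggregated_potential + self.bias.view(1, -1, 1, 1)
```

In this sketch only normalization statistics and a per-channel bias are adjusted, which is consistent with the abstract's claim that the backbone weights and the one-shot firing rule remain unchanged.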
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 16950