Keywords: Dynamic Vision Sensors; Dataset Distillation; Spiking Neural Networks
Abstract: Event cameras emit sparse, polarity-signed streams that align with how spiking neural networks compute in time, yet image-centric dataset distillation transfers poorly to this regime. We present PACE (Phase-Aligned Condensation for Events), the first event-native dataset distillation framework for SNNs, which comprises two core modules: ST-DSM and PEQ-N. ST-DSM densifies spikes with residual membrane potential and aligns real and synthetic streams by matching amplitude and phase, using a characteristic-function projection in feature space and a discrete Fourier transform along time. PEQ-N is a probabilistic quantizer whose forward pass emits hard integer frames while a straight-through estimator preserves gradients, keeping compatibility with standard event-frame pipelines. We optimize only the synthetic data with a time-expanded condensation objective on frozen teacher features, which encourages causal spatiotemporal structure and shortens convergence time. On DVS-Gestures with IPC=10, using 9.29% of the data, PACE reaches 76.5% accuracy, about 89% of full-data performance and +20.4 points over a strong baseline. Similar gains appear on CIFAR10-DVS and N-MNIST and transfer across SNN backbones. PACE delivers compact, accurate surrogates that reduce storage and wall-clock time, make minutes-to-converge training practical on neuromorphic streams, and open a path to efficient on-device learning and reproducible distilled benchmarks.
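The abstract's amplitude-and-phase alignment along time can be illustrated with a minimal sketch. This is not the paper's ST-DSM objective (whose exact formulation is not given here, and which additionally uses a characteristic-function projection in feature space); it only shows the DFT-along-time idea of comparing amplitude and phase spectra of real versus synthetic feature streams. The function name and shapes are assumptions.

```python
import numpy as np

def phase_amplitude_distance(real_feats, syn_feats):
    """Illustrative distance between two feature streams of shape
    (batch, time, feat): take a DFT along the time axis, then compare
    batch-averaged amplitude spectra and unit-normalized (phase-only)
    spectra. Names and weighting are hypothetical, not the paper's."""
    Fr = np.fft.rfft(real_feats, axis=1)   # DFT over time
    Fs = np.fft.rfft(syn_feats, axis=1)
    # Amplitude term: squared gap between batch-mean magnitude spectra.
    amp = np.mean((np.abs(Fr).mean(axis=0) - np.abs(Fs).mean(axis=0)) ** 2)
    # Phase term: compare unit-modulus complex spectra (avoids angle
    # wrap-around issues that arise when subtracting raw phases).
    ur = (Fr / (np.abs(Fr) + 1e-8)).mean(axis=0)
    us = (Fs / (np.abs(Fs) + 1e-8)).mean(axis=0)
    phase = np.mean(np.abs(ur - us) ** 2)
    return amp + phase
```

In a distillation loop, this distance would be computed on frozen teacher features of real and synthetic streams, with gradients flowing only into the synthetic data.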
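The PEQ-N behavior described above (hard integer frames in the forward pass, straight-through gradients in the backward pass) follows a standard stochastic-rounding-plus-STE pattern. The sketch below shows only the forward pass under that assumption; the function name, clamp range, and seeding are hypothetical, and in training the backward pass would treat the operation as the identity.

```python
import numpy as np

def peq_n_forward(x, n_levels=8, rng=None):
    """Sketch of a PEQ-N-style forward pass (names hypothetical):
    clamp real-valued frames to [0, n_levels], then stochastically
    round so the output is a hard integer event-frame count whose
    expectation equals the input. With autograd, a straight-through
    estimator would pass gradients through this op unchanged."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.clip(x, 0.0, float(n_levels))
    floor = np.floor(x)
    prob_up = x - floor                          # fractional part
    hard = floor + (rng.random(x.shape) < prob_up)  # round up w.p. frac
    return hard.astype(np.int64)
```

Stochastic rounding keeps the quantizer unbiased (E[output] = clamped input), which is one common reason to prefer it over deterministic rounding when the quantizer sits inside an optimization loop.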
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 2427