Keywords: state space models, event-based vision, adaptive memory, sequence modeling
TL;DR: We introduce FLAME, a novel neuro-inspired architecture that effectively captures long-range temporal dependencies in event streams through Event-Aware HiPPO and an Event Attention Layer.
Abstract: We propose Fast Long-range Adaptive Memory for Event (FLAME), a novel scalable architecture that combines neuro-inspired feature extraction with robust structured sequence modeling to efficiently process asynchronous and sparse event camera data. Departing from conventional input encoding methods, FLAME introduces the Event Attention Layer, a novel feature extractor that leverages neuromorphic Leaky Integrate-and-Fire (LIF) dynamics to directly capture multi-timescale features from event streams. The feature extractor is integrated with a structured state-space model featuring a novel Event-Aware HiPPO (EA-HiPPO) mechanism that dynamically adapts memory retention based on inter-event intervals, capturing relationships across varying temporal scales and event sequences. A Normal Plus Low-Rank (NPLR) decomposition reduces the computational complexity of the state update from $\mathcal{O}(N^2)$ to $\mathcal{O}(Nr)$, where $N$ is the dimension of the core state vector and $r$ is the rank of the low-rank component ($r \ll N$). FLAME demonstrates state-of-the-art accuracy for event-by-event processing on complex event camera datasets.
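A minimal sketch of these ideas in NumPy, under stated assumptions: the function names, the Euler-style discretization with an event-aware retention gate, and all parameter values below are hypothetical illustrations, not the paper's actual equations. The snippet shows (i) an LIF neuron extracting a spike feature per event, (ii) an EA-HiPPO-style update whose retention adapts to the inter-event interval `dt`, and (iii) an NPLR matvec $\Lambda x + P(Q^\top x)$ that costs $\mathcal{O}(Nr)$ instead of $\mathcal{O}(N^2)$.

```python
import numpy as np

rng = np.random.default_rng(0)
N, r = 64, 4                              # state dimension N, low rank r << N

# NPLR-structured transition A = diag(lam) + P @ Q.T (A is never materialized)
lam = -np.linspace(1.0, 2.0, N)           # stable diagonal (normal) part
P = 0.01 * rng.standard_normal((N, r))    # low-rank factors
Q = 0.01 * rng.standard_normal((N, r))
B = rng.standard_normal(N)                # input projection

def lif_feature(v, inp, tau=20.0, v_th=1.0):
    """One Leaky Integrate-and-Fire step: leak, integrate, spike, reset."""
    v = v * np.exp(-1.0 / tau) + inp
    spike = float(v >= v_th)
    return v * (1.0 - spike), spike       # hard reset on spike

def ea_hippo_step(x, u, dt, rate=1.0):
    """Hypothetical event-aware update: retention decays with the inter-event
    interval dt; lam * x + P @ (Q.T @ x) costs O(N*r), not O(N^2)."""
    alpha = np.exp(-rate * dt)            # longer gaps -> stronger forgetting
    Ax = lam * x + P @ (Q.T @ x)          # low-rank structured transition
    return alpha * x + dt * (Ax + B * u)  # simple Euler-style step

# Drive the model event by event with irregular (asynchronous) timestamps.
times = np.cumsum(rng.exponential(scale=0.05, size=200))
x, v, t_prev = np.zeros(N), 0.0, 0.0
for t in times:
    v, s = lif_feature(v, inp=1.0)        # neuromorphic feature for this event
    x = ea_hippo_step(x, u=s, dt=t - t_prev)
    t_prev = t
print("final state norm:", np.linalg.norm(x))
```

Because the transition matrix is kept in factored form, each event costs roughly $\mathcal{O}(Nr)$ work, which is what would make event-by-event processing tractable at high event rates.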
Supplementary Material: zip
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 16376