MMDEND: Dendrite-Inspired Multi-Branch Multi-Compartment Parallel Spiking Neuron for Sequence Modeling

ACL ARR 2024 December Submission 1030 Authors

15 Dec 2024 (modified: 05 Feb 2025) · License: CC BY 4.0
Abstract: Vanilla spiking neurons are simplified from complex biological neurons with dendrites, soma, and synapses into single somatic compartments. Due to limitations in performance and training efficiency, they face significant challenges in modeling long sequences. In terms of performance, the oversimplified dynamics of spiking neurons omit long-term temporal dependencies, and the long-tail membrane potential distribution, together with the discretization error of binary activation, further limits their capacity to model long sequences. In terms of efficiency, the serial mechanism of spiking neurons leads to excessively long training times on long sequences. Although parallel spiking neurons are an efficient solution, their parameter count is often tied to the hidden dimension or the sequence length, which makes current parallel neurons unsuitable for large architectures. To address these issues, we propose MMDEND: a Multi-Branch Multi-Compartment Parallel Spiking Dendritic Neuron. Its proportion-adjustable multi-branch, multi-compartment structure enables temporal dynamics with long-term dependencies. In addition, we introduce a Scaling-Shifting Integer Firing (SSF) mechanism that fits the long-tail membrane potential distribution, mitigating discretization errors while retaining efficiency. Compared with existing parallel neurons, MMDEND achieves better long-sequence modeling capability with fewer parameters and lower energy consumption. Visualization also confirms that the SSF mechanism effectively fits long-tail distributions.
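To make the SSF idea concrete, below is a minimal, hypothetical sketch based only on the abstract's description: the membrane potential is scaled and shifted before being discretized into a bounded integer spike count, so that a long-tailed potential distribution incurs less error than binary thresholding. The parameterization (per-channel scale `alpha`, shift `beta`, spike-count cap `max_spikes`) and the straight-through gradient are assumptions for illustration, not the authors' exact formulation.

```python
# Hypothetical sketch of a Scaling-Shifting Integer Firing (SSF) step.
# Names and formulation are assumed from the abstract, not taken from the paper.
import torch
import torch.nn as nn


class SSFNeuron(nn.Module):
    def __init__(self, hidden_dim: int, max_spikes: int = 4):
        super().__init__()
        # Learnable per-channel scale and shift (assumed parameterization).
        self.alpha = nn.Parameter(torch.ones(hidden_dim))
        self.beta = nn.Parameter(torch.zeros(hidden_dim))
        self.max_spikes = max_spikes  # upper bound on the integer spike count

    def forward(self, v: torch.Tensor) -> torch.Tensor:
        # v: membrane potential, shape (batch, time, hidden_dim).
        # Scale and shift so the long tail is compressed into [0, max_spikes].
        u = self.alpha * v + self.beta
        spikes = torch.clamp(torch.round(u), 0, self.max_spikes)
        # Straight-through estimator: integer spikes forward, identity backward.
        return u + (spikes - u).detach()


if __name__ == "__main__":
    neuron = SSFNeuron(hidden_dim=8)
    v = torch.randn(2, 16, 8) * 2.0  # toy membrane potentials
    s = neuron(v)
    print(s.shape, s.unique())
```

In this sketch the forward pass emits small integer spike counts (graded spikes) rather than binary spikes, which is one plausible way an integer firing mechanism could reduce discretization error while keeping activations cheap to transmit.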
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: application; language model; spiking neurons
Languages Studied: English
Submission Number: 1030