BrainGPT: A Brain-Inspired SNN-Based Large Language Model

ICLR 2025 Conference Submission 140 Authors

13 Sept 2024 (modified: 13 Oct 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: Spiking Neural Networks, Large Language Models, Spike-Timing-Dependent Plasticity, Neuromorphic Computing, ANN-to-SNN Conversion
TL;DR: BrainGPT: Energy-efficient SNN-based LLM with 100% ANN-equivalent performance and biological plausibility.
Abstract: Large language models (LLMs) based on artificial neural networks (ANNs) have demonstrated remarkable performance but face challenges in computational efficiency and biological interpretability. We propose BrainGPT, a novel LLM architecture based on the Test-Time Training (TTT) framework and inspired by spiking neural networks (SNNs) and neurobiological principles. Our approach incorporates a dual-model structure, emulating the hierarchical language processing observed in the human brain, and employs a specialized integrate-and-fire neuron model with adaptive thresholding. Through a multi-stage training strategy, comprising quantization-aware pre-training, ANN-to-SNN conversion, and biologically inspired unsupervised learning, we achieve a mathematically proven lossless conversion from ANN to SNN, preserving 100% of the original ANN model's performance. Moreover, the biologically inspired unsupervised learning reduces the maximum number of time steps required to maintain 100% ANN performance. Compared to the original TTT model, BrainGPT achieves a 33.4% increase in energy efficiency and a 66.7% improvement in training convergence speed. This work advances the development of energy-efficient and biologically interpretable large language models that match the performance of state-of-the-art ANN-based models while significantly improving upon the TTT framework.
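The abstract does not spell out the neuron model, so below is a minimal sketch of what an integrate-and-fire neuron with adaptive thresholding can look like: the membrane potential accumulates input, a spike is emitted when it crosses a threshold, and the threshold temporarily rises after each spike before decaying back. All names and constants (AdaptiveIFNeuron, theta_plus, tau_theta) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class AdaptiveIFNeuron:
    """Illustrative integrate-and-fire neuron layer with an adaptive threshold.

    This is a sketch for intuition only; the paper's actual neuron model,
    parameters, and reset rule may differ.
    """

    def __init__(self, n, v_threshold=1.0, theta_plus=0.05, tau_theta=0.99):
        self.v = np.zeros(n)             # membrane potential per neuron
        self.base_threshold = v_threshold
        self.theta = np.zeros(n)         # adaptive component of the threshold
        self.theta_plus = theta_plus     # threshold increment after a spike
        self.tau_theta = tau_theta       # decay pulling theta back toward 0

    def step(self, input_current):
        """Integrate one time step of input and return binary spikes."""
        self.v += input_current
        threshold = self.base_threshold + self.theta
        spikes = (self.v >= threshold).astype(float)
        # Soft reset: subtract the threshold instead of zeroing the potential,
        # a choice commonly used in ANN-to-SNN conversion to reduce information loss.
        self.v = np.where(spikes > 0, self.v - threshold, self.v)
        # Raise the threshold for neurons that just fired, then let it decay.
        self.theta = self.tau_theta * self.theta + self.theta_plus * spikes
        return spikes


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    neuron = AdaptiveIFNeuron(n=4)
    for t in range(8):
        current = rng.uniform(0.0, 0.6, size=4)  # toy pre-synaptic input
        print(t, neuron.step(current))
```

The soft reset and the adaptive term are the two ingredients relevant to the claims above: reset-by-subtraction is one of the standard mechanisms behind (near-)lossless ANN-to-SNN conversion, and the adaptive threshold is one plausible way an unsupervised stage could shorten the number of time steps needed per token.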
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 140