BrainGPT: A Brain-Inspired SNN-Based Large Language Model

ICLR 2025 Conference Submission 140 Authors

13 Sept 2024 (modified: 13 Oct 2024) · ICLR 2025 Conference Submission · CC BY 4.0
Keywords: Spiking Neural Networks, Large Language Models, Spike-Timing-Dependent Plasticity, Neuromorphic Computing, ANN-to-SNN Conversion
TL;DR: BrainGPT: Energy-efficient SNN-based LLM with 100% ANN-equivalent performance and biological plausibility.
Abstract: Large language models (LLMs) based on artificial neural networks (ANNs) have demonstrated remarkable performance but face challenges in computational efficiency and biological interpretability. We propose BrainGPT, a novel LLM architecture based on the Test-Time Training (TTT) framework and inspired by spiking neural networks (SNNs) and neurobiological principles. Our approach incorporates a dual-model structure, emulating the hierarchical language processing observed in the human brain, and employs a specialized integrate-and-fire neuron model with adaptive thresholding. Through a multi-stage training strategy, comprising quantization-aware pre-training, ANN-to-SNN conversion, and biologically inspired unsupervised learning, we achieve a mathematically proven lossless conversion from ANN to SNN, preserving 100% of the original ANN model's performance. Moreover, the biologically inspired unsupervised learning reduces the maximum number of time steps required to maintain 100% ANN performance. Compared to the original TTT model, BrainGPT achieves a 33.4% increase in energy efficiency and a 66.7% improvement in training convergence speed. This work advances the development of energy-efficient and biologically interpretable large language models that match the performance of state-of-the-art ANN-based models while significantly improving upon the TTT framework.
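The abstract does not spell out the neuron model, so below is a minimal sketch of what an integrate-and-fire neuron with adaptive thresholding can look like: the membrane potential accumulates input, a spike is emitted when it crosses a threshold, and the threshold temporarily rises after each spike before decaying back. All names and constants (AdaptiveIFNeuron, theta_plus, tau_theta) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class AdaptiveIFNeuron:
    """Illustrative integrate-and-fire neuron layer with an adaptive threshold.

    This is a sketch for intuition only; the paper's actual neuron model,
    parameters, and reset rule may differ.
    """

    def __init__(self, n, v_threshold=1.0, theta_plus=0.05, tau_theta=0.99):
        self.v = np.zeros(n)             # membrane potential per neuron
        self.base_threshold = v_threshold
        self.theta = np.zeros(n)         # adaptive component of the threshold
        self.theta_plus = theta_plus     # threshold increment after a spike
        self.tau_theta = tau_theta       # decay pulling theta back toward 0

    def step(self, input_current):
        """Integrate one time step of input and return binary spikes."""
        self.v += input_current
        threshold = self.base_threshold + self.theta
        spikes = (self.v >= threshold).astype(float)
        # Soft reset: subtract the threshold instead of zeroing the potential,
        # a choice commonly used in ANN-to-SNN conversion to reduce information loss.
        self.v = np.where(spikes > 0, self.v - threshold, self.v)
        # Raise the threshold for neurons that just fired, then let it decay.
        self.theta = self.tau_theta * self.theta + self.theta_plus * spikes
        return spikes


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    neuron = AdaptiveIFNeuron(n=4)
    for t in range(8):
        current = rng.uniform(0.0, 0.6, size=4)  # toy pre-synaptic input
        print(t, neuron.step(current))
```

The soft reset and the adaptive term are the two ingredients relevant to the claims above: reset-by-subtraction is one of the standard mechanisms behind (near-)lossless ANN-to-SNN conversion, and the adaptive threshold is one plausible way an unsupervised stage could shorten the number of time steps needed per token.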
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 140