Transformer as a hippocampal memory consolidation model based on NMDAR-inspired nonlinearity

Published: 21 Sept 2023, Last Modified: 15 Jan 2024NeurIPS 2023 posterEveryoneRevisionsBibTeX
Keywords: Transformer, NMDA, long-term memory, reference memory, memory consolidation
Abstract: The hippocampus plays a critical role in learning, memory, and spatial representation, processes that depend on the NMDA receptor (NMDAR). Inspired by recent findings that compare deep learning models to the hippocampus, we propose a new nonlinear activation function that mimics NMDAR dynamics. NMDAR-like nonlinearity shifts short-term working memory into long-term reference memory in transformers, thus enhancing a process that is similar to memory consolidation in the mammalian brain. We design a navigation task assessing these two memory functions and show that manipulating the activation function (i.e., mimicking the Mg$^{2+}$-gating of NMDAR) disrupts long-term memory processes. Our experiments suggest that place cell-like functions and reference memory reside in the feed-forward network layer of transformers and that nonlinearity drives these processes. We discuss the role of NMDAR-like nonlinearity in establishing this striking resemblance between transformer architecture and hippocampal spatial representation.
Supplementary Material: zip
Submission Number: 7337