Topoformer: brain-like topographic organization in Transformer language models through spatial querying and reweighting

23 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: applications to neuroscience & cognitive science
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Transformer, Topographic organization, Cortex, Neuroscience, Language
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
TL;DR: We spatialize self-attention to organize transformer representations as brain-like topographic maps
Abstract: Spatial functional organization is a hallmark of biological brains: neurons are arranged topographically according to their response properties at different scales. In contrast, representations within most machine learning models lack spatial biases, and instead manifest as disorganized vector spaces that are difficult to visualize and interpret. Here, we propose a novel form of self-attention that turns Transformers into "Topoformers" with topographic organization. Our primary contribution is Spatial Querying, where keys and queries are arranged on 2D grids, and local pools of queries are associated with a given key. Our secondary contribution is Spatial Reweighting, where we convert the standard fully connected layer of self-attention into a locally connected layer. We first demonstrate the feasibility of our approach by training a 1-layer Topoformer on a sentiment classification task. We show that training with Spatial Querying results in corresponding topographic organization between queries and keys, and Spatial Reweighting results in corresponding topographic organization between values and self-attention outputs. This emergent organization is semantically interpretable: the internal activation magnitudes show spatial biases for sentences with positive and negative sentiment. Moreover, generic topographic organization is seen in the low-dimensional structure of activations, revealed through principal component analysis. After establishing that we can indeed obtain interpretable topography, we apply the Topoformer motifs at scale. We train the widely used BERT architecture on larger corpora with a masked language modeling objective. We find that the topographic variant of this model performs on par with a non-topographic control architecture on downstream NLP benchmarks. Finally, we analyze an fMRI dataset of human brain responses to a large set of naturalistic sentences, demonstrating that the Topoformer exhibits forms of topographic organization for linguistic information similar to those present in the language network of individual subjects. Scaling up Topoformers holds promise for greater interpretability in NLP research, and for more accurate models of the organization of linguistic and semantic information in the human brain.
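To make the two proposed motifs concrete, below is a minimal PyTorch sketch of one plausible reading of the abstract: the d_model units of the query/key/value vectors are arranged on a sqrt(d) x sqrt(d) grid, Spatial Querying lets each key unit interact only with a local pool of query units via a grid-neighborhood mask, and Spatial Reweighting masks the dense output projection into a locally connected one. The grid layout, neighborhood radius, and the choice of which projection is localized are assumptions for illustration, not the authors' exact implementation.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def local_grid_mask(d_model: int, radius: int = 2) -> torch.Tensor:
    """Binary (d_model x d_model) mask: units i and j are connected iff they
    lie within `radius` of each other on a sqrt(d_model) x sqrt(d_model) grid.
    (Assumed construction; the paper may use a different neighborhood.)"""
    side = int(math.isqrt(d_model))
    assert side * side == d_model, "d_model must be a perfect square for a 2D grid"
    coords = torch.stack(
        torch.meshgrid(torch.arange(side), torch.arange(side), indexing="ij"),
        dim=-1,
    ).reshape(-1, 2).float()
    dist = torch.cdist(coords, coords)           # pairwise grid distances
    return (dist <= radius).float()

class TopoSelfAttentionSketch(nn.Module):
    """Single-head self-attention with hypothetical Spatial Querying and
    Spatial Reweighting masks applied over the feature (unit) dimension."""
    def __init__(self, d_model: int, radius: int = 2):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, d_model)
        self.register_buffer("mask", local_grid_mask(d_model, radius))
        self.scale = d_model ** -0.5

    def forward(self, x):                        # x: (batch, seq, d_model)
        Q, K, V = self.q(x), self.k(x), self.v(x)
        # Spatial Querying: each key unit sees only a local pool of query units
        # on the 2D grid, so scores use (Q @ mask) rather than Q directly.
        scores = (Q @ self.mask) @ K.transpose(-2, -1) * self.scale
        attn = scores.softmax(dim=-1)
        ctx = attn @ V
        # Spatial Reweighting: locally connected output projection, implemented
        # here by masking the dense weight matrix with the same grid mask.
        return F.linear(ctx, self.out.weight * self.mask, self.out.bias)
```

As a usage note, this block drops into a standard Transformer layer in place of the usual attention module (e.g., `TopoSelfAttentionSketch(d_model=64)` for an 8x8 grid); the locality masks are what would, per the abstract, induce corresponding topographic organization between queries and keys and between values and attention outputs.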
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 8052