Thinking Like TransformersDownload PDF

28 Sept 2020 (modified: 22 Oct 2023)ICLR 2021 Conference Blind SubmissionReaders: Everyone
Keywords: transformers, interpretability
Abstract: What is the computational model behind a transformer? Where recurrent neural networks have direct parallels in finite state machines, allowing clear discussion and thought around architecture variants or trained models, transformers have no such familiar parallel. In this paper we aim to change that, proposing a computational model for the transformer-encoder in the form of a programming language. We map the basic components of a transformer-encoder – attention and feed-forward computation – into the simple primitives of select, aggregate, and zipmap, around which we form a programming language: the Restricted Access Sequence Process-ing Language (RASP). We show how RASP can be used to program solutions to tasks that could conceivably be learned by a transformer, augmenting it with tools we discover in our work. In particular, we provide RASP programs for histograms, sorting, and even logical inference similar to that of Clark et al. (2020). We further use our model to relate their difficulty in terms of the number of required layers and attention heads. Finally, we see how insights gained from our abstraction might be used to explain phenomena seen in recent works.
One-sentence Summary: A proposed computational model for reasoning about transformers, in the form of a programming language, presented with example programs and revisiting of existing works.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Supplementary Material: zip
Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](
Reviewed Version (pdf):
17 Replies