Representations and Computations in Transformers that Support Generalization on Structured Tasks
Abstract: Transformers have shown remarkable success in natural language processing and computer vision, serving as the foundation of large language and multimodal models. These networks can capture nuanced context sensitivity across high-dimensional language tokens or image pixels, but it remains unclear how highly structured behavior and systematic generalization can arise in these systems. Here, we explore the solution process a causal transformer discovers as it learns to solve a set of algorithmic tasks involving copying, sorting, and hierarchical compositions of these operations. We search for the minimal layer and head configuration sufficient to solve these tasks and unpack the roles of the attention heads, as well as how token representations are reweighted across layers to complement these roles. Our results provide new insights into how attention layers in transformers support structured computation within and across tasks: 1) Replacing fixed position labels with labels sampled from a larger set enables strong length generalization and faster learning. The learnable embeddings of these labels develop different representations, capturing sequence order if necessary, depending on task demand. 2) Two-layer transformers can learn reliable solutions to the multi-level problems we explore. The first layer tends to transform the input representation to allow the second layer to share computation across repeated components within a task or across related tasks. 3) We introduce an analysis pipeline that quantifies how the representation space in a given layer prioritizes different aspects of each item. We show that these representations prioritize information needed to guide attention relative to information that only requires downstream readout.
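As an illustration of the label-based encoding in point 1, the following is a minimal sketch, not the authors' implementation. It assumes that position labels are drawn without replacement from a label set larger than any training sequence, and that they are sorted so their relative order still reflects item order. The function name `sample_position_labels` and the parameters `max_label` and `keep_order` are hypothetical names introduced only for this example.

```python
import random


def sample_position_labels(seq_len, max_label, rng, keep_order=True):
    """Draw `seq_len` distinct labels from range(max_label).

    Assumption (not stated in the abstract): labels are sorted so that
    their relative order matches the order of items in the sequence.
    """
    if seq_len > max_label:
        raise ValueError("label set must be at least as large as the sequence")
    labels = rng.sample(range(max_label), seq_len)
    if keep_order:
        labels.sort()
    return labels


rng = random.Random(0)

# Fixed scheme: item i always receives label i.
fixed_labels = list(range(5))

# Sampled scheme: the same 5 items receive 5 ordered labels out of 50.
sampled_labels = sample_position_labels(5, 50, rng)

# A longer test sequence reuses the same label vocabulary, so every label
# index it needs could already have been encountered during training.
long_labels = sample_position_labels(40, 50, rng)
```

Under these assumptions, each label would index a learnable embedding, so a model trained on short sequences has already seen the labels that a longer test sequence uses.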
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: ===Aug 29=== Uploaded the camera-ready version. ===Jul 25=== We have updated the paper to provide further clarification and to incorporate responses to the reviewers' comments. Specifically: **Major change 1**: We revised the introduction of the label-based encoding method (page 3) for clarity and expanded the discussion of its implications (page 11) to address the relevant points raised in the reviewers' comments. **Major change 2**: We expanded the discussion of the broader implications of our work for more complex task and model settings (end of page 11), including further thoughts on the SVD-based representation analysis method and its implications for future work on mechanistic understanding (page 12 in Discussion; see also one additional paragraph in Related Work). **Minor changes**: We made minor changes to the main text and figures to address points of confusion raised in the reviews.
Assigned Action Editor: ~Stefan_Lee1
Submission Number: 1286