Universality, Composition Generalization, and Algorithm Emulation All In-Context

Published: 25 May 2026, Last Modified: 25 May 2026, Venue: CTB@ICML 2026, Readers: Everyone, License: CC BY 4.0
Keywords: in-context learning, in-context algorithm emulation, composition generalization, softmax attention, transformer
TL;DR: A constructive theory of in-context universal approximation and function composition by softmax Transformers
Abstract: We study the in-context universal approximation and compositional generalization of softmax Transformers. We prove an in-context universality result: a fixed-weight softmax Transformer approximates a broad class of continuous sequence-to-sequence functions. Building on this universality, we establish a composition theorem: by concatenating prompts associated with simple "subprograms," the same fixed Transformer executes their composition, and thereby synthesizes more complex programs on the fly. These results support a principled view of prompts as programs and fixed-weight Transformers as program interpreters. Moreover, we provide a concrete mechanism by which GPT-style models both execute and assemble algorithms in context.
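Illustration: the "prompts as programs" view can be made concrete with a toy numerical sketch. This is not the paper's construction. It is a minimal example that assumes a single softmax attention head with no learned weights, a scalar input, and a prompt made of (x, f(x)) pairs on a grid. The names softmax_attention and run_prompt, the grid size, and the inverse temperature beta are all hypothetical choices made for illustration. With a sharp softmax, the head acts as a kernel smoother over the prompt, so changing the prompt changes the function being computed while the attention rule stays fixed. Composition is imitated here by feeding the output of one prompt pass in as the query of a second pass, which is a simplification of prompt concatenation.

```python
import numpy as np

def softmax_attention(queries, keys, values, beta):
    """One softmax attention head over scalar keys and values.

    Scores are -beta * (q - k)^2. After the softmax normalization this equals
    dot-product attention with score 2*beta*q*k plus a per-key bias -beta*k^2,
    because the q^2 term is constant across keys and cancels.
    """
    scores = -beta * (queries[:, None] - keys[None, :]) ** 2
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ values

def run_prompt(f, queries, n_examples=200, beta=2000.0):
    """Build a prompt of (x, f(x)) pairs on [0, 1] and attend to it.

    The attention rule is the same for every f. Only the prompt changes.
    """
    xs = np.linspace(0.0, 1.0, n_examples)
    return softmax_attention(queries, xs, f(xs), beta)

queries = np.linspace(0.1, 0.9, 9)

# Two "subprograms", each specified purely by its prompt.
sin_out = run_prompt(np.sin, queries)
square_out = run_prompt(np.square, queries)

# Composition: the output of the sin prompt becomes the query for the square prompt.
composed = run_prompt(np.square, sin_out)

print("max error, sin:        ", np.max(np.abs(sin_out - np.sin(queries))))
print("max error, square:     ", np.max(np.abs(square_out - queries ** 2)))
print("max error, square(sin):", np.max(np.abs(composed - np.sin(queries) ** 2)))
```

The queries are kept away from the ends of the grid because a kernel smoother is biased near the boundary of its prompt. A larger beta sharpens the attention and reduces smoothing error, provided the prompt grid is fine enough to keep several examples inside the attention window.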
Paper Type: Long (8 pages)
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 6