Universality, Composition Generalization, and Algorithm Emulation All In-Context

Published: 26 May 2026, Last Modified: 26 May 2026ICML 2026 FoGen Workshop PosterEveryoneRevisionsBibTeXCC BY 4.0
Keywords: in-context learning, in-context algorithm emulation, composition generalization, softmax attention, transformer
TL;DR: A constructive theory of in-context universal approximation and function composition by softmax Transformers
Abstract: We study the in-context universal approximation and compositional generalization of softmax Transformers. We prove an in-context universality result: a fixed-weight softmax Transformer approximates a broad class of continuous sequence-to-sequence functions. Building on this universality, we establish a composition theorem: by concatenating prompts associated with simple ``subprograms,'' the same fixed Transformer executes their composition, and thereby synthesizes more complex programs on-the-fly. These results support a principled view of prompts as programs and fixed-weight Transformers as program interpreters. Moreover, we provide a concrete mechanism by which GPT-style models both execute and assemble algorithms in context.
Submission Number: 1
Loading