The Expressivity of Fixed-Precision Transformers without Positional Encoding

24 Jan 2025 (modified: 18 Jun 2025) · Submitted to ICML 2025 · CC BY 4.0
TL;DR: We theoretically demonstrate that practical Transformer models can recognize only finite or co-finite languages.
Abstract: The primary objective of this study is to examine how practical constraints affect the expressivity of Transformers in real-world implementations. To this end, we analyze the expressivity of Transformer decoders under fixed-precision floating-point arithmetic, an assumption on the query-key parameters, and the presence or absence of positional encoding. Our findings reveal that, under fixed precision and these constraints, Transformers are limited to recognizing finite or co-finite languages, a proper subclass of the regular languages. While incorporating positional encoding or relaxing certain assumptions marginally enhances expressivity, the fundamental limitations imposed by fixed precision remain significant. These results underscore the gap between theoretical models and real-world implementations, suggesting that practical Transformers may be fundamentally constrained to recognizing only finite and co-finite languages, effectively functioning as little more than efficient lookup tables.
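As a rough intuition for why fixed precision is so limiting (this is an illustrative sketch, not the paper's construction), the Python snippet below checks at which sequence length the uniform attention weight 1/n stops changing under float16 arithmetic. Once consecutive lengths map to the same representable float, any computation that depends on such weights can no longer separate longer inputs.

```python
import numpy as np

# Illustrative sketch (not from the paper): in fixed-precision floats, the
# uniform attention weight 1/n collapses to the same representable value once
# n exceeds a precision-dependent threshold, so a pooled representation built
# from these weights cannot distinguish inputs longer than that threshold.
def first_indistinguishable_length(dtype=np.float16, max_len=100_000):
    prev = None
    for n in range(1, max_len + 1):
        w = dtype(1.0) / dtype(n)  # uniform attention weight in fixed precision
        if prev is not None and w == prev:
            return n  # 1/n rounds to the same float as 1/(n-1)
        prev = w
    return None

print(first_indistinguishable_length())  # a few thousand for float16
```

In effect, all sufficiently long inputs are forced into the same behavior, which matches the shape of a finite or co-finite language: membership can be decided individually only up to some bounded length, with a single uniform decision beyond it.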
Primary Area: Deep Learning->Theory
Keywords: Transformer, Formal Language, Expressivity
Submission Number: 16326