A completely uniform transformer for parity

Published: 01 Jan 2025, Last Modified: 03 May 2025, CoRR 2025, CC BY-SA 4.0
Abstract: We construct a 3-layer, constant-dimension transformer that recognizes the parity language, in which neither the parameter matrices nor the positional encoding depend on the input length. This improves upon a construction of Chiang and Cholak, whose positional encoding depends on the input length (although their construction uses only 2 layers).
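
For readers unfamiliar with the target language, the following minimal sketch shows a reference membership test. It assumes the common convention that the parity language consists of binary strings containing an odd number of 1s; the abstract does not state the convention, and the function name `in_parity` is hypothetical. This is not the transformer construction, only the specification such a transformer would have to match on inputs of every length.

```python
def in_parity(w: str) -> bool:
    """Reference membership test for the parity language.

    Assumption: the language is the set of strings over {0, 1}
    that contain an odd number of 1s.
    """
    if any(ch not in "01" for ch in w):
        raise ValueError("expected a string over the alphabet {0, 1}")
    # Count the 1s and check whether the count is odd.
    return w.count("1") % 2 == 1
```

The same function is applied unchanged to inputs of any length, which is the kind of length-independence the abstract claims for the transformer's parameters and positional encoding.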