Learning Modular Exponentiation with Transformers

Published: 17 Oct 2025 · Last Modified: 21 Nov 2025 · MATH-AI 2025 Poster · CC BY 4.0
Keywords: Transformer, modular exponentiation, grokking, circuits
TL;DR: We train small Transformers to compute modular exponentiation, analyze how they learn the task, and find sudden generalization across families of related moduli.
Abstract: Modular exponentiation ($a^b \equiv d \bmod c$) is crucial to number theory and cryptography, yet remains largely unexplored from a mechanistic interpretability standpoint. We train compact 4‑layer encoder–decoder Transformers to predict $d$ and analyze how they come to solve the task. We compare principled sampling schemes for $(a,b,c,d)$, probe the learned token embeddings, and use causal interventions (activation patching) to localize the computation inside the network. Sampling $a$ and $b$ log‑uniformly (reciprocal sampling) removes severe output imbalance and yields large accuracy gains, with abrupt, synchronized jumps in accuracy that simultaneously cover families of related moduli (e.g., multiples of 23). Causal analysis shows that, on instances without reduction ($c > a^b$), a small circuit consisting only of final‑layer attention heads reproduces full‑model behavior, indicating functional specialization. These results suggest that Transformers can internalize modular arithmetic via compact, specialized circuits, and that data distribution strongly shapes both learning dynamics and generalization.
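To illustrate the reciprocal (log-uniform) sampling scheme the abstract describes, here is a minimal Python sketch of how $(a,b,c,d)$ training instances could be generated. The value ranges and the uniform sampling of the modulus $c$ are illustrative assumptions, not the paper's exact configuration.

```python
import math
import random


def reciprocal_sample(lo, hi):
    """Draw an integer log-uniformly from [lo, hi]: each order of magnitude is
    equally likely, which avoids the heavy skew toward large a^b that uniform
    sampling of a and b would produce."""
    return int(math.exp(random.uniform(math.log(lo), math.log(hi))))


def sample_instance(max_val=1000):
    """Sample one (a, b, c, d) tuple with d = a**b mod c.
    max_val and the uniform choice of c are assumptions for this sketch."""
    a = reciprocal_sample(1, max_val)
    b = reciprocal_sample(1, max_val)
    c = random.randint(2, max_val)   # modulus; sampling scheme assumed here
    d = pow(a, b, c)                 # Python's built-in fast modular exponentiation
    return a, b, c, d


if __name__ == "__main__":
    random.seed(0)
    for _ in range(5):
        print(sample_instance())
```

Because the target $d$ is a residue modulo $c$, log-uniform sampling of $a$ and $b$ spreads the outputs across residues far more evenly than uniform sampling, which is the output-imbalance effect the abstract attributes to reciprocal sampling.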
Submission Number: 61