Probing the Limits of Mathematical World Models in LLMs

Henry Kvinge; Elizabeth Coda; Eric Yeats; Davis Brown; John Buckheit; Sarah McGuire Scullen; Brendan Kennedy; Loc Truong; William Kay; Cliff Joslyn; Tegan Emerson; Michael J. Henry; John Anthony Emanuello

Published: 10 Jun 2025, Last Modified: 14 Jul 2025; ICML 2025 World Models Workshop; CC BY 4.0

Keywords: Mathematical world models, Linear probing, representation geometry and topology

TL;DR: We investigate whether the mathematical world models of LLMs align with structures and properties from mathematics broadly

Abstract: There are now many studies supporting the idea that even when they are trained on a broad corpus of textual data scraped from the internet, large language models (LLMs) are (sporadically) capable of performing non-trivial mathematical tasks. This observation, together with a collection of studies from the interpretability community, suggests that LLMs extract surprisingly rich internal representations of mathematical objects. In this paper we ask to what extent LLMs contain mathematical 'world models' that align with the way that mathematicians understand and think about mathematics. We focus on simple binary operations $\star: X \times X \rightarrow X$, like addition and multiplication, which take two inputs $a$ and $b$ from a space $X$ and produce a third element $a \star b = c$. Instead of assessing the correctness of the LLM response, we explore the extent to which the model captures the geometric structure of $X$, simple number-theoretic properties of $a$ and $b$, and the algebraic properties of $\star$. We report mixed results. While the LLMs we tested tended to store substantial amounts of information (such as the divisibility properties of integers $a$ and $b$ in the expression $a \times b$) and sometimes extracted representations that aligned with existing mathematical structures (reconstructing a patch of $\mathbb{R}^2$, for example), these representations tended to be local in nature and to lack robustness.
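
As a rough illustration of what linear probing for a number-theoretic property can look like, the following is a minimal sketch, not the authors' code or data. It assumes a hypothetical `toy_representation` of an integer built from periodic features, standing in for an LLM hidden state, and fits a logistic-regression probe to read off divisibility by 3. All names, the feature construction, and the hyperparameters are assumptions made for this example.

```python
import math

def toy_representation(a):
    """Hypothetical stand-in for a hidden state: periodic features of a.
    (An assumption for illustration; real probes use model activations.)"""
    feats = []
    for period in (2, 3, 5):
        angle = 2 * math.pi * a / period
        feats += [math.cos(angle), math.sin(angle)]
    return feats

def train_linear_probe(xs, ys, lr=0.5, steps=200):
    """Fit a logistic-regression probe by full-batch gradient ascent
    on the log-likelihood."""
    dim = len(xs[0])
    n = len(xs)
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-score))
            err = y - p
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi + lr * g / n for wi, g in zip(w, grad_w)]
        b += lr * grad_b / n
    return w, b

def probe_accuracy(w, b, xs, ys):
    """Fraction of examples whose thresholded linear score matches the label."""
    correct = 0
    for x, y in zip(xs, ys):
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        correct += int(pred == y)
    return correct / len(xs)

def divisible_by_3(a):
    return 1 if a % 3 == 0 else 0

# Train on one range of integers and evaluate on a disjoint range.
train_ints = range(1, 151)
test_ints = range(151, 301)
w, b = train_linear_probe([toy_representation(a) for a in train_ints],
                          [divisible_by_3(a) for a in train_ints])
accuracy = probe_accuracy(w, b,
                          [toy_representation(a) for a in test_ints],
                          [divisible_by_3(a) for a in test_ints])
print(f"held-out probe accuracy: {accuracy:.2f}")
```

High probe accuracy here only shows that the property is linearly decodable from the toy features. It says nothing about whether a given LLM encodes it, nor about the locality and robustness issues the abstract reports.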

Submission Number: 7
