No LLM Solved Yu Tsumura's 554th Problem

ICLR 2026 Conference Submission 19861 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: LLMs, mathematics, reasoning, evaluation

TL;DR: Contrary to the optimism about LLMs' problem-solving abilities, we show that comparatively simple problems exist that no LLM solves.

Abstract: Recent gold-medal performances by LLMs at the International Mathematical Olympiad (IMO) have fueled optimism about LLMs' problem-solving abilities. Contrary to this optimism, we show that a problem exists, Yu Tsumura's 554th problem, that a) is within the scope of an IMO problem in terms of proof sophistication, b) is not a combinatorics problem, a category that has caused issues for LLMs, c) requires fewer proof techniques than typical hard IMO problems, d) has a publicly available solution (likely in the training data of LLMs), and e) cannot be readily solved by any existing off-the-shelf LLM (commercial or open-source). We include an analysis of the output traces of 16 SOTA LLMs. Additionally, we compare the generic LLM output to a new proof by a former IMO participant, produced in a small study, which is significantly better motivated than the original, publicly available proof, and we elaborate on the differences between LLM and human proof quality.

Primary Area: datasets and benchmarks

Submission Number: 19861
