The IMO Small Challenge: First IMO Dataset for LLMs

Published: 19 Mar 2024, Last Modified: 19 Mar 2024, Venue: Tiny Papers @ ICLR 2024 Present, Readers: Everyone, License: CC BY 4.0
Keywords: Olympiad math problems, LLMs, problem-solving, datasets
TL;DR: We introduce a dataset of hard (but not too hard) IMO-level problems.
Abstract: We introduce the IMO Small Challenge: a curated collection of the easiest possible IMO problems and other competitive mathematical problems. The goal is to bridge the gap in the difficulty range of available datasets for testing problem-solving skills: currently, datasets are predominantly either too easy (MATH or GSM8K), excessively challenging (solving arbitrary IMO problems, as required by the IMO Grand Challenge and embodied by the miniF2F dataset), or focus too little on problem-solving (GHOSTS). Our challenge interpolates this difficulty range and serves as a test bench for next-generation language models. We release a preliminary version of a dataset that accompanies this challenge. It is grounded in natural language, and problems are annotated with solutions and other metadata, such as the type of proof strategy used, to facilitate semi-automatic evaluation of LLMs' outputs beyond classical correct-incorrect keyword matching.
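As a rough illustration of the annotation scheme described in the abstract, a problem record might be sketched as follows. The field names (`statement`, `solution`, `proof_strategy`, `source`), the example problem, and the `strategy_matches` helper are all hypothetical and not taken from the released dataset. The strategy check is a deliberately crude signal, meant only to show how proof-strategy metadata could give an evaluator something to compare against besides a final answer.

```python
from dataclasses import dataclass


@dataclass
class Problem:
    """Hypothetical record layout; the actual dataset schema may differ."""
    statement: str
    solution: str
    proof_strategy: str  # assumed free-text label, e.g. "induction"
    source: str = ""


def strategy_matches(problem: Problem, model_output: str) -> bool:
    """Crude check: does the model output mention the annotated strategy?"""
    return problem.proof_strategy.lower() in model_output.lower()


# Illustrative record, not an entry from the dataset.
example = Problem(
    statement="Show that n^3 - n is divisible by 6 for every integer n.",
    solution=(
        "n^3 - n = (n-1)n(n+1) is a product of three consecutive integers, "
        "so it is divisible by 2 and by 3."
    ),
    proof_strategy="factorization",
    source="illustrative example",
)

print(strategy_matches(example, "We use a factorization argument."))
```

A real evaluator would need a far more robust comparison than substring matching; the sketch only shows where the metadata would enter.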
Submission Number: 230