The IMO Small Challenge: Not-Too-Hard Olympiad Math Datasets for LLMs

Published: 19 Mar 2024, Last Modified: 25 Jun 2024
Venue: Tiny Papers @ ICLR 2024 Present
License: CC BY 4.0
Keywords: Olympiad math problems, LLMs, problem-solving, datasets
TL;DR: We introduce a dataset of hard (but not too hard) IMO-level problems.
Abstract: We introduce the IMO Small Challenge (IMOSC), in contrast to the IMO Grand Challenge: a text-only, natural-language dataset of problems from various mathematical competitions. The IMOSC dataset exceeds the difficulty of datasets that are currently widely used for LLM evaluation, such as the MATH dataset, while not being too challenging for the current generation of LLMs. The IMOSC currently contains a carefully curated collection of the easiest possible problems from difficult competitions, such as the International Mathematical Olympiad (IMO). Problem difficulty is assessed by applying a mixture of objective and subjective difficulty filters to the original problems. We release the full dataset at the following link to encourage transparent evaluation of the mathematical proof-generating abilities of LLMs and theorem provers: www.imo-small-challenge.io
Submission Number: 230