Machine Learning meets Algebraic Combinatorics: A Suite of Datasets to Accelerate AI for Mathematics Research

Herman Chau; Helen Jenne; Davis Brown; Jesse He; Mark Raugas; Sara C. Billey; Henry Kvinge

Published: 10 Oct 2024, Last Modified: 31 Oct 2024. Venue: MATH-AI 24. License: CC BY 4.0

Keywords: Algebraic combinatorics, AI for Math, Datasets

TL;DR: We introduce a collection of datasets for machine learning that target both open and foundational results in algebraic combinatorics

Abstract: The use of benchmark datasets has become an important engine of progress in machine learning (ML) over the past 15 years. Recently there has been growing interest in utilizing machine learning to drive advances in research-level mathematics. However, off-the-shelf solutions often fail to deliver the types of insights required by mathematicians. This suggests the need for new ML methods specifically designed with mathematics in mind. The question then is: what benchmarks should the community use to evaluate these? On the one hand, toy problems such as learning the multiplicative structure of small finite groups have become popular in the mechanistic interpretability community, whose perspective on explainability aligns well with the needs of mathematicians. While toy datasets are useful for guiding initial work, they lack the scale, complexity, and sophistication of many of the principal objects of study in modern mathematics. To address this, we introduce a new collection of datasets, the Algebraic Combinatorics Dataset Repository (ACD Repo), representing either classic or open problems in algebraic combinatorics, a subfield of mathematics that studies discrete structures arising from abstract algebra. After describing the datasets, we discuss the challenges involved in constructing "good" mathematics datasets for ML and describe baseline model performance.

Concurrent Submissions: This work is currently under review in the benchmarks and datasets track at NeurIPS

Submission Number: 80
