Keywords: math, program synthesis, abstraction learning, program induction, rule learning, synthetic data
Abstract: Abstract Interpretation provides a framework for approximating the behavior of discrete systems by establishing a correspondence between concrete execution traces and abstract properties. We apply this framework to mathematics to address the inverse problem: given a single concrete example, automatically synthesize a general program (the abstraction) that, when executed, produces specific, valid problem instances (the concretization). Prior approaches to capturing this structure rely on hand-crafted templates, a labor-intensive process that restricts the technique to arithmetic word problems or small datasets. We introduce EFAGen, a method that operationalizes this inference as a program synthesis task, generating Executable Functional Abstractions (EFAs) that encode the parameters, constraints, and solution procedure of the seed problem. Because formal verification of synthesized code is intractable, we filter candidates using executable unit tests that enforce necessary properties. We demonstrate that these inferred abstractions enable data augmentation that complements existing strong data mixes for math reasoning and facilitate adversarial search to discover problem variants that models fail to solve.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 21279
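To make the abstract's terms concrete, the sketch below shows what an abstraction with parameters, constraints, and a solution procedure might look like for a toy seed problem, together with unit-test filtering of candidates. It is an illustration only, not the EFAGen implementation: the seed problem "Solve for x: 3x + 5 = 20", the class and method names (`LinearEquationEFA`, `sample`, `is_valid`, `render`, `solve`), the broken candidate, and the specific checks in `passes_unit_tests` are all assumptions introduced here.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearParams:
    """Parameters of the hypothetical problem family a*x + b = c."""
    a: int
    b: int
    c: int


class LinearEquationEFA:
    """Hypothetical abstraction of the seed 'Solve for x: 3x + 5 = 20'."""

    seed = LinearParams(a=3, b=5, c=20)
    seed_answer = 5

    def sample(self, rng):
        # Choose the answer first so the constraint holds by construction.
        a = rng.randint(1, 9)
        x = rng.randint(-20, 20)
        b = rng.randint(0, 50)
        return LinearParams(a=a, b=b, c=a * x + b)

    def is_valid(self, p):
        # Constraint: the equation must have an integer solution.
        return p.a != 0 and (p.c - p.b) % p.a == 0

    def render(self, p):
        return f"Solve for x: {p.a}x + {p.b} = {p.c}"

    def solve(self, p):
        return (p.c - p.b) // p.a


class BrokenEFA(LinearEquationEFA):
    """A deliberately wrong candidate: its solver ignores the coefficient a."""

    def solve(self, p):
        return p.c - p.b


def passes_unit_tests(efa, trials=100, rng_seed=0):
    """Assumed necessary-condition checks; passing does not prove correctness."""
    # 1. The abstraction must reproduce the seed problem's known answer.
    if not efa.is_valid(efa.seed) or efa.solve(efa.seed) != efa.seed_answer:
        return False
    rng = random.Random(rng_seed)
    rendered = set()
    for _ in range(trials):
        p = efa.sample(rng)
        # 2. Every sampled instance must satisfy the declared constraints.
        if not efa.is_valid(p):
            return False
        # 3. The returned solution must satisfy the instance's equation.
        if p.a * efa.solve(p) + p.b != p.c:
            return False
        rendered.add(efa.render(p))
    # 4. Sampling must yield more than one distinct problem.
    return len(rendered) > 1


candidates = [LinearEquationEFA(), BrokenEFA()]
kept = [c for c in candidates if passes_unit_tests(c)]
```

In this sketch the tests act only as a filter on necessary properties: the broken candidate is rejected because it fails to reproduce the seed answer, while a candidate that passes is retained without any formal guarantee of correctness.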