Enumerate-Conjecture-Prove: Formally Solving Answer-Construction Problems in Math Competitions

ICLR 2026 Conference Submission 13730 Authors

18 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: Automated Reasoning, Theorem Proving, Autoformalization, AI for Math, LLM

TL;DR: We introduce ECP, a modular neuro-symbolic pipeline that combines LLM-driven enumeration and conjecturing with theorem provers to tackle constructive olympiad-level math problems in Lean.

Abstract: Mathematical reasoning is central to artificial intelligence, with applications in education, code generation, and research-level mathematical discovery. Mathematical competitions highlight two problem types: theorem-proving, requiring rigorous proofs, and answer-construction, requiring creative generation and formal verification of mathematical objects. Existing research reveals that LLMs can tackle difficult answer-construction tasks but are prone to errors from hallucinations and unverifiable steps, while symbolic methods guarantee rigor but falter in creative answer construction. This raises a key understudied question: how to solve answer-construction problems while preserving both LLM creativity and mathematical rigor? To address this problem, we introduce the Enumerate-Conjecture-Prove (ECP) framework, a modular neuro-symbolic method integrating LLM-based enumeration and pattern-driven conjecturing with formal theorem proving in Lean, and ConstructiveBench, a dataset of 3,640 formal answer-construction problems from math competitions. ECP is model-agnostic and shows consistent improvements over pure LLM baselines: on the subset of PutnamBench for answer construction, ECP formally solves 6 out of 337 answer-construction problems end-to-end (up from 4 without ECP) with GPT-5 mini and DeepSeek-Prover-V2-7B. On ConstructiveBench, ECP achieves 33.1% end-to-end state-of-the-art accuracy (up from 32.5%), demonstrating its potential to advance formal mathematical reasoning by combining LLM conjecturing with formal verification.
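
To make the three stages named in the abstract concrete, here is a minimal Python sketch on a toy problem ("find a closed form for the sum of the first n odd numbers"). It is an illustration under stated assumptions, not the authors' pipeline: the function names, the candidate list, and the toy problem are all hypothetical, the conjecture step is a simple pattern match rather than an LLM, and the final step is only a bounded numeric check standing in for a formal Lean proof.

```python
from typing import Callable, Dict, List, Optional


def brute_force_answer(n: int) -> int:
    """Ground-truth answer for the toy problem: sum of the first n odd numbers."""
    return sum(2 * i + 1 for i in range(n))


def enumerate_answers(max_n: int) -> List[int]:
    """Enumerate stage: compute the answer directly for small cases n = 1..max_n."""
    return [brute_force_answer(n) for n in range(1, max_n + 1)]


# Hypothetical pool of closed forms; in ECP the conjecture would come from an LLM.
CANDIDATES: Dict[str, Callable[[int], int]] = {
    "n": lambda n: n,
    "n*(n+1)//2": lambda n: n * (n + 1) // 2,
    "n**2": lambda n: n ** 2,
    "2**n - 1": lambda n: 2 ** n - 1,
}


def conjecture(observed: List[int]) -> Optional[str]:
    """Conjecture stage: return the first candidate matching every enumerated value."""
    for name, closed_form in CANDIDATES.items():
        if all(closed_form(n) == value for n, value in enumerate(observed, start=1)):
            return name
    return None


def bounded_check(name: str, up_to: int) -> bool:
    """Stand-in for the Prove stage: a finite check, NOT a proof.

    A real pipeline would hand the conjectured answer to a theorem prover.
    """
    closed_form = CANDIDATES[name]
    return all(closed_form(n) == brute_force_answer(n) for n in range(1, up_to + 1))


observed = enumerate_answers(6)
guess = conjecture(observed)
if guess is not None:
    print(f"conjectured closed form: {guess}; bounded check passed: {bounded_check(guess, 200)}")
```

The bounded check only rules out wrong conjectures on a finite range; the point of the Prove stage in the abstract is that a formal proof replaces this kind of testing with a guarantee for all n.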

Supplementary Material: zip

Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)

Submission Number: 13730
