Keywords: Agent, Formal Theorem Proving, Automated Theorem Proving, Small Language Model
TL;DR: We present Prover Agent, an AI agent for automated theorem proving that integrates LLMs with Lean and auxiliary lemma generation, achieving an 88.1% success rate on MiniF2F, a new state of the art among methods using small language models.
Abstract: We present Prover Agent, a novel AI agent for automated theorem proving that integrates large language models (LLMs) with the formal proof assistant Lean. Prover Agent coordinates an informal reasoning LLM, a formal prover model, and feedback from Lean, and it generates auxiliary lemmas that help it discover the overall proof strategy. It achieves an 88.1% success rate on the MiniF2F benchmark, a new state of the art among methods using small language models (SLMs), with a much lower sample budget than previous approaches. We also present theoretical analyses and case studies that illustrate how the generated lemmas contribute to solving challenging problems.
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 18484