Keywords: Autoformalization, Formal Mathematics, AI for Math, Neurosymbolic AI, LLM, Large Language Models, Formal Theorem Proving, Neural Theorem Proving
TL;DR: We designed a framework that not only improves LLMs at formalizing a corpus of statements, but also learns a library of reusable abstractions, which scales to statements outside of the corpus.
Abstract: Existing approaches to autoformalization---the task of translating informal mathematics into formal machine-verifiable languages---rely heavily on pre-defined libraries and expect LLMs to directly generate complete formalizations. These approaches face three fundamental limitations: they are bottlenecked by existing abstractions, they have difficulty handling the complexity of realistic statements, and they do not transfer well across formal languages.
We propose $\textit{Divide and Abstract (DNA)}$, a zero-training framework that addresses these challenges through a two-phase approach.
First, $\textit{DNA}$ extracts common mathematical concepts from the entire corpus and formalizes them as reusable abstractions, extending the target language's capability.
Second, $\textit{DNA}$ hierarchically decomposes new statements into structured informal clauses, translates each clause using the learned abstractions, and composes them into complete formalizations.
Our evaluation on the LeanEuclidPlus and ProofNet-Hard benchmarks demonstrates consistent improvements across multiple model families, achieving up to $\textbf{8.6}\times$ performance gains over baselines.
Notably, $\textit{DNA}$ enables smaller models to match baselines using much larger models, and shows particularly strong performance on complex mathematical statements requiring nested reasoning. Furthermore, our framework requires no training on target languages, making it effective for low-resource domain-specific languages. Our code is available at https://github.com/marcusm117/DNA.
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 10057
Loading