Keywords: Machine Learning, Automated Theorem Proving, LEAN, Critic-Guided Search
Abstract: Large Language Models (LLMs) have emerged as powerful tools in mathematical theorem proving, particularly when utilizing formal languages such as LEAN.
A prevalent proof method involves the LLM prover iteratively constructing the proof tactic by tactic, typically following a best-first search scheme.
However, this method discards the preference information implicit in the tactic trajectories already explored, hindering the search for deeper proofs.
We propose an intuitive yet effective method that uses a critic model to capture this preference information and guide the prover model's search at runtime.
Within this prover-critic framework, we then run large-scale expert iteration, consuming more than 20,000 CPU days, to further fine-tune both the prover and the critic.
The trained critic model significantly boosts the performance of the prover model (from 59.4% to 65.9%).
We also analyze the impact of the critic on various aspects of the theorem proving process during expert iteration, providing insights into its effectiveness.
The models and the discovered proofs will be open-sourced.
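The critic-guided best-first search described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `expand` (the prover proposing tactics), `critic_score` (the learned preference signal), and `is_proved` are hypothetical interfaces introduced here for clarity.

```python
import heapq
import itertools

def best_first_search(initial_state, expand, critic_score, is_proved, max_expansions=100):
    """Critic-guided best-first search over proof states (illustrative sketch).

    expand(state) -> list of (tactic, next_state) candidates from the prover.
    critic_score(state) -> higher means the critic judges the state more promising.
    is_proved(state) -> True when no goals remain.
    """
    counter = itertools.count()  # tie-breaker so heapq never compares raw states
    # heapq is a min-heap, so negate scores to pop the highest-scored state first.
    frontier = [(-critic_score(initial_state), next(counter), initial_state, [])]
    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, state, tactics = heapq.heappop(frontier)
        if is_proved(state):
            return tactics  # tactic sequence that closes the goal
        for tactic, next_state in expand(state):
            heapq.heappush(
                frontier,
                (-critic_score(next_state), next(counter), next_state, tactics + [tactic]),
            )
    return None  # search budget exhausted without a proof
```

The key difference from plain best-first search is that the priority comes from a trained critic rather than, say, the prover's own log-probabilities, letting trajectory-level preference information steer which branch is expanded next.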
Submission Number: 125