Keywords: Machine Learning, Automated Theorem Proving, LEAN, Critic-Guided Search
Abstract: Large Language Models (LLMs) have emerged as powerful tools in mathematical theorem proving, particularly when utilizing formal languages such as LEAN.
A prevalent proof method involves the LLM prover iteratively constructing the proof tactic by tactic, typically following a best-first search scheme.
However, this method discards the preference information implicit in the tactic trajectories already explored, hindering the search for deeper proofs.
We propose an intuitive yet effective method that uses a critic model to capture this preference information and guide the prover model's search at runtime.
Within this prover-critic framework, we then run large-scale expert iteration, consuming more than 20,000 CPU days, to further fine-tune both the prover and the critic.
The trained critic model significantly boosts the performance of the prover model (from 59.4% to 65.9%).
We also analyze the impact of the critic on various aspects of the theorem proving process during expert iteration, providing insights into its effectiveness.
The models and the discovered proofs will be open-sourced.
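The critic-guided best-first search described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `expand` (the prover proposing tactics), `critic_score` (the learned preference signal), and `is_proved` are hypothetical interfaces introduced here for clarity.

```python
import heapq
import itertools

def best_first_search(initial_state, expand, critic_score, is_proved, max_expansions=100):
    """Critic-guided best-first search over proof states (illustrative sketch).

    expand(state) -> list of (tactic, next_state) candidates from the prover.
    critic_score(state) -> higher means the critic judges the state more promising.
    is_proved(state) -> True when no goals remain.
    """
    counter = itertools.count()  # tie-breaker so heapq never compares raw states
    # heapq is a min-heap, so negate scores to pop the highest-scored state first.
    frontier = [(-critic_score(initial_state), next(counter), initial_state, [])]
    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, state, tactics = heapq.heappop(frontier)
        if is_proved(state):
            return tactics  # tactic sequence that closes the goal
        for tactic, next_state in expand(state):
            heapq.heappush(
                frontier,
                (-critic_score(next_state), next(counter), next_state, tactics + [tactic]),
            )
    return None  # search budget exhausted without a proof
```

The key difference from plain best-first search is that the priority comes from a trained critic rather than, say, the prover's own log-probabilities, letting trajectory-level preference information steer which branch is expanded next.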
Submission Number: 125