QDTSynth: Quality-Driven Formal Theorem Synthesis for Enhancing Proving Performance of LLMs

ACL ARR 2025 February Submission 7883 Authors

16 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · CC BY 4.0
Abstract: Automated theorem proving is an important and challenging task. Although large language models (LLMs) have demonstrated remarkable potential in mathematical reasoning, their performance in formal theorem proving remains constrained by the scarcity of high-quality supervised fine-tuning (SFT) data. To address this limitation, we propose a $\textbf{Q}$uality-$\textbf{D}$riven $\textbf{T}$heorem $\textbf{S}$ynthesis method (QDTSynth) in Lean4. During statement synthesis, we enhance Monte Carlo Tree Search (MCTS) with an adaptive adjustment mechanism that dynamically tunes the search strategy according to the statements synthesized so far. In addition, we propose a diversity screening and a self-assessment method that select statements exhibiting both diversity and high quality from the initially synthesized pool, yielding a high-quality Lean4 theorem dataset. After fine-tuning three open-source LLMs on our synthetic dataset, experiments on the miniF2F benchmark demonstrate that QDTSynth significantly improves their performance on theorem proving tasks. Our work offers a promising new direction for the synthesis of high-quality formal mathematical theorems.
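The abstract describes the pipeline only at a high level, so the sketch below is a rough illustration of one component: a generic MCTS loop whose UCT exploration constant is adapted based on recent rollout rewards, one simple way to "dynamically tune the search strategy". Everything in it (the toy token grammar, the `quality` placeholder standing in for the self-assessment signal, and the spread-based adjustment rule) is an assumption made for illustration, not the authors' actual implementation.

```python
import math
import random

# Hypothetical toy grammar: a "statement" is a short sequence of tokens.
TOKENS = ["n", "+", "*", "0", "1"]
MAX_LEN = 6

class Node:
    def __init__(self, state, parent=None):
        self.state = state                  # partial statement (token list)
        self.parent = parent
        self.children = []
        # Actions not yet expanded from this node.
        self.untried = list(TOKENS) if len(state) < MAX_LEN else []
        self.visits = 0
        self.value = 0.0

def quality(state):
    # Placeholder quality signal rewarding token variety. The paper's actual
    # self-assessment scoring is not given here; this is illustrative only.
    return len(set(state)) / len(TOKENS)

def uct(child, parent_visits, c):
    exploit = child.value / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore

def synthesize(iterations=2000, c0=1.4, window=100):
    root = Node([])
    c, recent = c0, []
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded.
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: uct(ch, node.visits, c))
        # Expansion: try one untried token.
        if node.untried:
            tok = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(node.state + [tok], parent=node)
            node.children.append(child)
            node = child
        # Simulation: random rollout to a complete statement.
        state = list(node.state)
        while len(state) < MAX_LEN:
            state.append(random.choice(TOKENS))
        reward = quality(state)
        # Backpropagation.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
        # Adaptive adjustment (assumed form): widen exploration when recent
        # rewards stagnate, tighten it when they still vary.
        recent.append(reward)
        if len(recent) == window:
            spread = max(recent) - min(recent)
            c = min(2.0, c * 1.1) if spread < 0.1 else max(0.5, c * 0.9)
            recent.clear()
    # Return the most-visited first move as the most promising prefix.
    return max(root.children, key=lambda ch: ch.visits)

if __name__ == "__main__":
    print("most promising first token:", synthesize().state)
```

Adapting the exploration constant from the spread of recent rewards is just one plausible reading of "adaptive adjustment"; the paper's actual mechanism, reward signal, and action space for Lean4 statements may differ substantially.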
Paper Type: Long
Research Area: Machine Learning for NLP
Research Area Keywords: data augmentation, reinforcement learning
Contribution Types: Model analysis & interpretability
Languages Studied: Lean4
Submission Number: 7883