Keywords: theorem proving, Coq, structured reasoning, formal verification
Abstract: In this work, we investigate whether improving task clarity can enhance the reasoning ability of large language models, focusing on theorem proving in Coq. We introduce a concept-level metric to evaluate task clarity and show that adding structured semantic context to the standard input used by modern LLMs leads to a 1.85$\times$ improvement in clarity score (44.5\%~$\rightarrow$~82.3\%). Using the general-purpose model DeepSeek-V3, our approach yields a 2.1$\times$ improvement in proof success (21.8\%~$\rightarrow$~45.8\%) and outperforms the previous state-of-the-art, Graph2Tac (33.2\%). We evaluate on 1,386 theorems randomly sampled from 15 standard Coq packages, following the same evaluation protocol as Graph2Tac.
Furthermore, fine-tuning smaller models on our structured data achieves even higher performance (48.6\%).
Our method uses selective concept unfolding to enrich task descriptions and employs a Planner-Executor architecture. These findings highlight the value of structured task representations in bridging the gap between understanding and reasoning.
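As a hypothetical illustration of what selective concept unfolding can look like (the names and proof below are illustrative, not the paper's actual pipeline): when a goal mentions a defined concept such as injective, its definition can be unfolded and included alongside the theorem statement in the task description given to the model.

(* Illustrative sketch: the goal mentions [injective], so its definition
   is selectively unfolded into the enriched task description. *)
Definition injective {A B : Type} (f : A -> B) : Prop :=
  forall x y, f x = f y -> x = y.

Theorem compose_injective {A B C : Type} (f : A -> B) (g : B -> C) :
  injective f -> injective g -> injective (fun x => g (f x)).
Proof.
  intros Hf Hg x y H.
  apply Hf. apply Hg. exact H.
Qed.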
Primary Area: neurosymbolic & hybrid AI systems (physics-informed, logic & formal reasoning, etc.)
Submission Number: 10135