Bootstrapping Hierarchical Autoregressive Formal Reasoner with Chain-of-Proxy-Autoformalization

Qi Liu; Xinhao Zheng; Renqiu Xia; Qinxiang Cao; Junchi Yan

Published: 18 Sept 2025, Last Modified: 29 Oct 2025 | NeurIPS 2025 poster | CC BY 4.0

Keywords: large language model, math QA, formal verification, formal theorem proving, formal problem-solving

TL;DR: For deductive formal problem-solving (D-FPS), we propose HAR, a hierarchical autoregressive reasoning pipeline, and CoPA, a data generation pipeline.

Abstract: Deductive formal problem-solving (D-FPS) enables process-verified, human-aligned problem-solving by implementing deductive solving processes within formal theorem proving (FTP) environments. However, current methods fail to address the misalignment between informal and formal reasoning granularity and suffer from inefficiency due to backtracking and error propagation. Moreover, the extreme scarcity of formal problem-solution pairs further hinders progress. To address the first gap, we propose **HAR** (_**H**ierarchical **A**utoregressive Formal **R**easoner_), a novel reasoning pipeline. HAR decouples informal-aligned drafting from detailed proving and formulates solution construction as autoregressive generation with per-step feedback. To address the second, we propose **CoPA** (_**C**hain-**o**f-**P**roxy-**A**utoformalization_), a data generation pipeline that cascades statement autoformalization, proof drafting, and proof search as a proxy autoformalization path. Experiments demonstrate significant improvements: trained on data bootstrapped by CoPA, HAR achieves superior performance on FormalMath500 ($15.50\% \mapsto 44.09\%$) and MiniF2F-Solving ($21.87\% \mapsto 56.58\%$) with a lower computational budget. Explorations reveal promising directions in formal solution pruning and informal dataset denoising.
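As a rough illustration of what "autoregressive generation with per-step feedback" could look like, the following is a minimal toy sketch and not the authors' implementation. The names `autoregressive_solve`, `propose`, `check`, and `is_done` are hypothetical; in a real system `propose` would presumably be a language model and `check` an FTP environment, whereas here both are simple stand-ins.

```python
from typing import Callable, List, Optional


def autoregressive_solve(
    propose: Callable[[List[str], Optional[str]], str],
    check: Callable[[List[str], str], Optional[str]],
    is_done: Callable[[List[str]], bool],
    max_steps: int = 16,
    max_retries: int = 3,
) -> Optional[List[str]]:
    """Build a solution one step at a time, verifying each step before committing it.

    `check` returns None when a step is accepted, or an error message otherwise.
    Accepted steps are never revisited, so there is no backtracking over them.
    """
    steps: List[str] = []
    for _step in range(max_steps):
        if is_done(steps):
            return steps
        feedback: Optional[str] = None
        for _attempt in range(max_retries):
            candidate = propose(steps, feedback)
            feedback = check(steps, candidate)
            if feedback is None:
                steps.append(candidate)  # commit the verified step
                break
        else:
            return None  # retry budget exhausted for this step
    return steps if is_done(steps) else None


def toy_demo() -> Optional[List[str]]:
    """Hypothetical toy run: solve x + 3 = 5 with one rejected step."""
    script = iter(["x + 3 = 5", "x = 9", "x = 2"])

    def propose(steps: List[str], feedback: Optional[str]) -> str:
        # Stand-in for a model: replays a fixed script and ignores feedback.
        return next(script)

    def check(steps: List[str], candidate: str) -> Optional[str]:
        # Stand-in for a formal verifier: accepts only whitelisted steps.
        valid = {"x + 3 = 5", "x = 2"}
        return None if candidate in valid else "step does not follow"

    def is_done(steps: List[str]) -> bool:
        return bool(steps) and steps[-1] == "x = 2"

    return autoregressive_solve(propose, check, is_done)
```

In this toy run the erroneous step `x = 9` is rejected as soon as it is proposed, so the error cannot propagate into later steps, and the already accepted step is kept rather than regenerated.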

Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)

Submission Number: 2827
