Keywords: Large Language Models, Large Reasoning Models, Mathematical Reasoning, Synthetic Data, RLVR
Abstract: Recent progress in reinforcement learning with verifiable rewards (RLVR) has substantially advanced the mathematical reasoning ability of large reasoning models (LRMs). However, existing datasets either rely heavily on manual annotation or are synthesized in artificial environments such as logic games. In this work, we propose a data synthesis framework that transforms formal mathematical statements into high-quality verifiable reasoning data. The framework first performs Statement Collection and Quality Control to obtain high-quality proven statements, then applies Problem Generation to convert them into verifiable math problems, and finally trains models via RLVR with a verifier. Using this framework, we synthesize 19k high-quality mathematical problems at levels 5–10 and train the F1-Reasoner series of models. Across six challenging benchmarks, F1-Reasoner consistently improves upon three open-weight models of different sizes, outperforming models such as SynLogic and Absolute-Zero that are trained on verifiable data from other environments. Moreover, mixing our data with MATH yields F1-Reasoner-Mix, which further boosts performance; notably, F1-Reasoner-Mix-8B surpasses General-Reasoner-14B while using substantially less data. Further analysis shows that F1-Reasoner generalizes to informal theorem proving and exhibits richer thinking behaviors.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 9641