Keywords: Language Model, Automatic Annotations, Mathematical Definitions and Theorems, RLHF
Abstract: Annotating mathematical knowledge is a fundamental prerequisite for structured knowledge acquisition from large-scale mathematical solutions. However, existing datasets lack fine-grained annotations of theorems and definitions at scale, which makes it challenging to assess how well modern mathematical reasoning models understand and apply specific knowledge within their complex chain-of-thought (CoT) outputs.
In this paper, we propose a new task: Automatic Annotation of Mathematical Definitions and Theorems (AAMDT), which aims to extract Mathematical Definitions and Theorems (MDTs) from long CoT reasoning. To tackle this, we introduce ReasonMap, a novel two-stage training framework. The first stage, Foundational Model Training, builds a broadly capable model by performing Supervised Fine-Tuning on a hybrid corpus of both concise human annotations and LLM-augmented long CoT data. The second stage, High-Fidelity Alignment, then refines this model using Direct Preference Optimization to ensure the final output is both precise and reliable.
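To make the second stage concrete: DPO trains the model to prefer a precise annotation over a noisier one without a separate reward model. The following is a minimal sketch of the standard DPO objective for a single preference pair; the variable names and the choice of beta are illustrative, not taken from the paper.

```python
import math

def dpo_loss(pi_logp_chosen, pi_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are summed token log-probabilities of the chosen (precise)
    and rejected (noisy) annotations under the trained policy and a
    frozen reference model; beta controls how far the policy may
    deviate from the reference.
    """
    margin = beta * ((pi_logp_chosen - ref_logp_chosen)
                     - (pi_logp_rejected - ref_logp_rejected))
    # -log sigmoid(margin): small when the policy already prefers
    # the chosen annotation relative to the reference model.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy favors the chosen annotation more than the reference does,
# so the loss is below log(2) (the value at zero margin).
print(dpo_loss(-10.0, -20.0, -12.0, -18.0))
```

In practice this loss is minimized over a dataset of (prompt, chosen annotation, rejected annotation) triples; the hedged sketch only shows the per-pair objective.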
Comprehensive experiments show that ReasonMap consistently outperforms strong baselines on the AAMDT task, especially in long CoT scenarios. Crucially, we validate the quality of our annotations by demonstrating that the extracted MDTs can directly enhance the mathematical reasoning performance of downstream large language models. Our work offers a scalable solution for the automatic annotation of mathematical corpora, significantly reducing the reliance on manual labeling. The GitHub repository can be found at \url{https://anonymous.4open.science/r/ReasonMap-6A83}.
Primary Area: generative models
Submission Number: 8410