Keywords: Language Model, Automatic Annotations, Mathematical Definitions and Theorems, RLHF
Abstract: Annotating mathematical knowledge is a fundamental prerequisite for structured knowledge acquisition from large-scale mathematical solutions. However, existing datasets lack fine-grained annotations of theorems and definitions at scale, which makes it challenging to assess how well modern mathematical reasoning models understand and apply specific knowledge within their complex chain-of-thought (CoT) outputs.
In this paper, we propose a new task: Automatic Annotation of Mathematical Definitions and Theorems (AAMDT), which aims to extract Mathematical Definitions and Theorems (MDTs) from long CoT reasoning. To tackle this, we introduce ReasonMap, a novel two-stage training framework. The first stage, Foundational Model Training, builds a broadly capable model by performing Supervised Fine-Tuning on a hybrid corpus of both concise human annotations and LLM-augmented long CoT data. The second stage, High-Fidelity Alignment, then refines this model using Direct Preference Optimization to ensure the final output is both precise and reliable.
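To make the second stage concrete: DPO trains the model to prefer a precise annotation over a noisier one without a separate reward model. The following is a minimal sketch of the standard DPO objective for a single preference pair; the variable names and the choice of beta are illustrative, not taken from the paper.

```python
import math

def dpo_loss(pi_logp_chosen, pi_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are summed token log-probabilities of the chosen (precise)
    and rejected (noisy) annotations under the trained policy and a
    frozen reference model; beta controls how far the policy may
    deviate from the reference.
    """
    margin = beta * ((pi_logp_chosen - ref_logp_chosen)
                     - (pi_logp_rejected - ref_logp_rejected))
    # -log sigmoid(margin): small when the policy already prefers
    # the chosen annotation relative to the reference model.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy favors the chosen annotation more than the reference does,
# so the loss is below log(2) (the value at zero margin).
print(dpo_loss(-10.0, -20.0, -12.0, -18.0))
```

In practice this loss is minimized over a dataset of (prompt, chosen annotation, rejected annotation) triples; the hedged sketch only shows the per-pair objective.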
Comprehensive experiments show that ReasonMap consistently outperforms strong baselines on the AAMDT task, especially in long CoT scenarios. Crucially, we validate the quality of our annotations by demonstrating that the extracted MDTs can directly enhance the mathematical reasoning performance of downstream large language models. Our work offers a scalable solution for the automatic annotation of mathematical corpora, significantly reducing the reliance on manual labeling. The GitHub repository can be found at \url{https://anonymous.4open.science/r/ReasonMap-6A83}.
Primary Area: generative models
Submission Number: 8410