Keywords: math reasoning, large reasoning models, rewriting
Abstract: Large Reasoning Models (LRMs) have achieved remarkable performance on mathematical reasoning tasks, but they still struggle with hard samples. Existing test-time scaling methods, such as repeated sampling, self-correction, and tree search, can improve performance but are computationally inefficient and quickly plateau on challenging problems. Rewriting techniques, which reformulate problem expressions to improve model understanding, have proven effective on challenging samples; however, they are often unnecessary or even harmful for easier ones. In this work, we propose a training-free framework that adaptively applies majority voting and rewriting based on model disagreement. Experiments on seven mathematical benchmark datasets and three models show that our method improves accuracy on mathematical reasoning tasks by 3%-7% while requiring fewer samples than existing methods.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: mathematical NLP
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 3427
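To make the adaptive strategy described in the abstract concrete, here is a minimal Python sketch of disagreement-gated voting plus rewriting. Everything in it is an illustrative assumption rather than the paper's actual implementation: the `sample` and `rewrite` callables, the sample budget `n`, the agreement threshold, and the answer-pooling step are all placeholders for whatever the authors use.

```python
from collections import Counter
from typing import Callable

def adaptive_solve(
    sample: Callable[[str, int], list[str]],  # assumed helper: draw n answers for a problem
    rewrite: Callable[[str], str],            # assumed helper: reformulate the problem text
    problem: str,
    n: int = 8,                               # sample budget per round (illustrative)
    agree_threshold: float = 0.6,             # disagreement gate (illustrative)
) -> str:
    """Vote first; rewrite and re-vote only when the model disagrees with itself."""
    answers = sample(problem, n)
    top, count = Counter(answers).most_common(1)[0]
    if count / n >= agree_threshold:
        # Strong agreement suggests an easy sample: return the majority answer
        # and skip rewriting, which the abstract notes can hurt easy samples.
        return top
    # High disagreement signals a hard sample: reformulate the problem,
    # sample again on the rewrite, and take a majority over the pooled answers.
    answers += sample(rewrite(problem), n)
    return Counter(answers).most_common(1)[0][0]
```

Gating on the majority-vote agreement ratio means easy problems exit after a single round of voting, while only contested problems pay the extra cost of rewriting and re-sampling, which is consistent with the abstract's claim of higher accuracy at a lower sample budget.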