Disagreement-Aware Repeated Sampling and Selective Rewriting for Complex Mathematical Reasoning

ACL ARR 2026 January Submission 3427 Authors

04 Jan 2026 (modified: 20 Mar 2026) · CC BY 4.0
Keywords: math reasoning, large reasoning models, rewriting
Abstract: Large Reasoning Models (LRMs) have achieved remarkable performance on mathematical reasoning tasks, but they still struggle with hard samples. Existing test-time scaling methods, such as repeated sampling, self-correction, and tree search, can improve performance but are computationally inefficient and quickly plateau on challenging problems. Rewriting techniques, which reformulate problem statements to enhance model understanding, have demonstrated effectiveness on challenging samples; however, they are often unnecessary or even harmful for easier samples. In this work, we propose a training-free framework that adaptively applies majority voting and rewriting based on model disagreement. Experiments on seven mathematical benchmark datasets and three models show that our method improves accuracy on mathematical reasoning tasks by 3%-7%, while requiring fewer samples than existing methods.
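The adaptive procedure the abstract describes, sampling repeatedly, then falling back to rewriting only when the samples disagree, can be sketched roughly as below. The function names (`sample_fn`, `rewrite_fn`), the agreement threshold, and the voting details are illustrative assumptions, not the paper's exact algorithm.

```python
from collections import Counter


def solve_adaptive(problem, sample_fn, rewrite_fn,
                   n_samples=8, agree_threshold=0.5):
    """Hypothetical sketch of disagreement-aware sampling + selective rewriting.

    sample_fn(problem) -> answer string from one LRM sample.
    rewrite_fn(problem) -> reformulated problem statement.
    """
    # Stage 1: repeated sampling on the original problem.
    answers = [sample_fn(problem) for _ in range(n_samples)]
    top_answer, top_count = Counter(answers).most_common(1)[0]

    # Low disagreement: the majority vote is likely reliable, so stop early
    # and avoid the cost (and risk) of rewriting an easy problem.
    if top_count / n_samples >= agree_threshold:
        return top_answer

    # High disagreement: rewrite the problem and sample again, then vote
    # over the pooled answers from both formulations.
    rewritten = rewrite_fn(problem)
    answers += [sample_fn(rewritten) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]
```

On easy problems this spends only the initial sampling budget; rewriting cost is paid only where the model's own samples signal uncertainty, which is consistent with the paper's claim of using fewer samples than fixed-budget methods.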
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: mathematical NLP
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 3427