Format-Adapter: Optimize Training-Free Test-Time Scaling by Adapting Suitable Format

ACL ARR 2026 January Submission1524 Authors

30 Dec 2025 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: Test-Time Scaling, Reasoning Format
Abstract: Test-time scaling (TTS) has emerged as a practical way to improve LLM reasoning by allocating more inference-time computation, especially through parallel scaling. However, most existing TTS pipelines scale compute within fixed reasoning formats, limiting performance. In this work, we propose that TTS should scale not only the number of samples, but also the reasoning formats. We first introduce an error measurement for multi-sample reasoning that enables comparing TTS behaviors across formats. Based on this, we propose Format-Adapter, a training-free, format-adaptive TTS method that automatically generates candidate reasoning formats and selects the most suitable ones under a test-time budget by minimizing the proposed error measurement. We evaluate Format-Adapter on mathematics and general reasoning benchmarks. Under the same TTS budget, Format-Adapter yields consistent gains over existing baselines, achieving an average relative improvement of 2.2% and establishing a new state of the art (SOTA) for training-free parallel-scaling TTS.
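The abstract's selection loop can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function names (`format_adaptive_tts`, `disagreement_error`, `sample_fn`), the per-format budget, and the use of majority-vote disagreement as the error measurement are all assumptions standing in for the paper's actual error measurement and candidate-format generation.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most common answer among the sampled responses."""
    return Counter(answers).most_common(1)[0][0]

def disagreement_error(answers):
    """Hypothetical error proxy: fraction of samples that disagree
    with the majority answer (stands in for the paper's measurement)."""
    if not answers:
        return 1.0
    _, count = Counter(answers).most_common(1)[0]
    return 1.0 - count / len(answers)

def format_adaptive_tts(question, formats, sample_fn, budget_per_format=8):
    """Sketch of a training-free, format-adaptive TTS loop:
    for each candidate reasoning format, draw samples under the budget,
    score the format by the error proxy, and answer with the majority
    vote of the lowest-error format."""
    best = None  # (error, format, answers) for the best format so far
    for fmt in formats:
        answers = [sample_fn(question, fmt) for _ in range(budget_per_format)]
        err = disagreement_error(answers)
        if best is None or err < best[0]:
            best = (err, fmt, answers)
    err, fmt, answers = best
    return fmt, majority_vote(answers)
```

Here `sample_fn(question, fmt)` is assumed to call the LLM once with the question rendered in reasoning format `fmt`; the format whose samples agree most is selected, and its majority answer is returned.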
Paper Type: Long
Research Area: Natural Language Generation
Research Area Keywords: scaling, prompting
Contribution Types: NLP engineering experiment
Languages Studied: English
Submission Number: 1524