Adaptive Test-Time Compute Allocation via Training-Free Difficulty Proxies

ICLR 2026 Conference Submission 12848 Authors

18 Sept 2025 (modified: 08 Oct 2025)
License: CC BY 4.0
Keywords: Test-Time Compute; LLMs; Adaptive Reasoning; Inference-Time Compute; Adaptive Allocation
TL;DR: We propose a framework for adaptive test-time compute allocation that leverages training-free difficulty proxies and requires no model fine-tuning.
Abstract: Large language models (LLMs) excel at complex tasks but incur prohibitive computational costs, particularly when using techniques like self-consistency that require multiple generation attempts. This paper addresses the challenge of adaptive test-time compute allocation. We propose a framework that leverages **training-free difficulty proxies** derived directly from the LLM generation process to distribute a fixed compute budget across test queries, without requiring specialized training for the allocation mechanism. Our objective is to maximize the number of solved instances by dynamically allocating more compute to difficult instances and less to simpler ones, while adhering to a total budget constraint. We first introduce several training-free proxies and empirically demonstrate their effectiveness in estimating instance difficulty. We then design an adaptive allocation strategy guided by these proxies, which is theoretically grounded in a novel bandit formulation. Experiments across math (MATH, GSM8K), coding (LiveCodeBench), and Q&A (e.g., GPQA-Diamond) benchmarks demonstrate that our method significantly outperforms both uniform budget allocation and training-based allocation baselines, solving substantially more problems under identical budget constraints. This work presents a practical and readily deployable approach to enhancing the resource efficiency of LLM inference for demanding reasoning tasks.
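The abstract only outlines the allocation strategy, so the following is a minimal sketch of the general idea, not the authors' method: it assumes a self-consistency setting where difficulty is proxied by answer disagreement over a few pilot samples, and it splits the remaining budget proportionally to that score, a deliberate simplification of the paper's bandit-based allocation. The function `generate_answer` is a hypothetical stub standing in for one sampled LLM completion.

```python
# Sketch (not the paper's released code) of proxy-guided test-time compute
# allocation for self-consistency. Difficulty proxy: answer disagreement
# across a small pilot batch of samples per query.
from collections import Counter
import random

def generate_answer(query: str) -> str:
    """Hypothetical stand-in for one sampled LLM completion."""
    return random.choice(["A", "B", "C"])

def disagreement(samples: list[str]) -> float:
    """Training-free difficulty proxy: 1 - majority-vote agreement rate."""
    top_count = Counter(samples).most_common(1)[0][1]
    return 1.0 - top_count / len(samples)

def adaptive_self_consistency(queries, total_budget, pilot=4):
    # Phase 1: spend a small pilot budget per query to estimate difficulty.
    samples = {q: [generate_answer(q) for _ in range(pilot)] for q in queries}
    scores = {q: disagreement(samples[q]) for q in queries}

    # Phase 2: allocate the leftover budget in proportion to difficulty,
    # so near-unanimous (easy) queries receive little or no extra compute.
    leftover = max(0, total_budget - pilot * len(queries))
    z = sum(scores.values()) or 1.0  # avoid division by zero if all easy
    for q in queries:
        extra = int(leftover * scores[q] / z)
        samples[q].extend(generate_answer(q) for _ in range(extra))

    # Final answer per query by majority vote over all of its samples.
    return {q: Counter(s).most_common(1)[0][0] for q, s in samples.items()}

print(adaptive_self_consistency(["q1", "q2", "q3"], total_budget=30))
```

A proportional split keeps the sketch simple; the bandit formulation described in the abstract would instead balance exploiting queries that appear hard against further exploring queries whose difficulty estimates are still uncertain.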
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 12848