Keywords: large language models, reasoning
TL;DR: We propose Adaptive Parallel Reasoning (APR), a reinforcement-learning-optimized inference framework allowing language models to dynamically balance serial and parallel computations, significantly enhancing reasoning accuracy and efficiency.
Abstract: Scaling inference-time computation has substantially improved the reasoning capabilities of language models. However, existing methods have significant limitations: serialized chain-of-thought approaches generate overly long outputs, leading to increased latency and exhausted context windows, while parallel methods such as self-consistency suffer from insufficient coordination, resulting in redundant computation and limited performance gains. To address these shortcomings, we propose Adaptive Parallel Reasoning (APR), a novel reasoning framework that enables language models to orchestrate both serialized and parallel computation end-to-end. APR generalizes existing reasoning methods by enabling adaptive multi-threaded inference using spawn() and join() operations. A key innovation is our end-to-end reinforcement learning strategy, which jointly optimizes both parent and child inference threads to improve task success rate without requiring predefined reasoning structures. Experiments on the Countdown reasoning task demonstrate significant benefits of APR: (1) higher performance within the same context window (83.4% vs. 60.0% at 4k context); (2) superior scalability with increased computation (80.1% vs. 66.6% at 20k total tokens); (3) improved accuracy at equivalent latency (75.2% vs. 57.3% at approximately 5,000ms). APR represents a step towards enabling language models to autonomously optimize their reasoning processes through adaptive allocation of computation.
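The spawn()/join() control flow described in the abstract can be sketched roughly as follows. This is a minimal illustrative sketch only, not the paper's implementation: the function names (`child_reasoner`, `parent_reasoner`), the thread-pool mechanism, and the toy Countdown-style sub-queries are all assumptions made for illustration.

```python
# Illustrative sketch of APR-style parent/child inference threads.
# Assumption: a parent reasoning thread decomposes the problem,
# spawn()s child threads on sub-queries, then join()s their results
# and continues serial reasoning. Names and logic are hypothetical.
from concurrent.futures import ThreadPoolExecutor


def child_reasoner(subquery: str) -> str:
    """Stand-in for a child inference thread exploring one branch."""
    return f"partial result for {subquery!r}"


def parent_reasoner(query: str, branches: list[str]) -> str:
    """Parent thread: serial reasoning that may spawn parallel children."""
    with ThreadPoolExecutor() as pool:
        # spawn(): launch child threads on decomposed sub-queries.
        futures = [pool.submit(child_reasoner, b) for b in branches]
        # join(): collect child results back into the parent's context.
        results = [f.result() for f in futures]
    # The parent continues serial reasoning conditioned on child outputs.
    return f"{query} -> " + "; ".join(results)


print(parent_reasoner("make 24 from [3, 8, 1]", ["try 3*8", "try 8/1"]))
```

Here the adaptivity the abstract describes would come from the model itself deciding, during generation, when to emit spawn()/join() rather than following a fixed parallel structure such as self-consistency.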
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 870