Towards Reasoning Reuse: A New Paradigm in Model Collaboration

Published: 01 Mar 2026, Last Modified: 05 Apr 2026. Venue: TTU at ICLR 2026 (Main). License: CC BY 4.0
Abstract: Large language models (LLMs) have demonstrated remarkable capabilities through training-time and test-time scaling, but the increased costs constrain their deployment in applications. Existing work has developed fine-grained collaboration frameworks that pair LLMs with small language models (SLMs). Although these frameworks succeed in balancing performance and cost, they are hard to deploy broadly, since they are typically trained for a specific set of models and assume white-box access to all collaborating models. We propose reasoning reuse, a training-free model collaboration framework via test-time updates: an LLM first generates a limited number of reasoning steps, and an SLM reuses these steps to continue inference. This setting opens a large design space over what the LLM emits and how the SLM reuses it. In this paper, we establish feasibility: our experiments show that an SLM can reuse an LLM’s reasoning steps. The ideas and findings in this work offer an alternative framework for efficient language model collaboration at test time, paving the way for future work in this direction.
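The two-stage flow described in the abstract can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the names `build_reuse_prompt`, `reasoning_reuse`, `llm_steps`, and `slm_continue` are hypothetical, the prompt layout is an assumption, and each model is treated as a black-box text function, in line with the training-free setting with no white-box access.

```python
from typing import Callable, List

# Hypothetical black-box interfaces (assumptions, not from the paper):
# the LLM maps (question, max_steps) to a list of reasoning steps,
# the SLM maps a prompt string to a completion string.
StepGenerator = Callable[[str, int], List[str]]
Continuer = Callable[[str], str]


def build_reuse_prompt(question: str, steps: List[str]) -> str:
    """Pack the question and the LLM's partial reasoning into one SLM prompt.

    The layout below is an assumed format chosen for illustration.
    """
    lines = [f"Question: {question}", "Partial reasoning:"]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, start=1)]
    lines.append("Continue the reasoning and give the final answer.")
    return "\n".join(lines)


def reasoning_reuse(
    question: str,
    llm_steps: StepGenerator,
    slm_continue: Continuer,
    max_llm_steps: int = 2,
) -> str:
    """The LLM emits at most max_llm_steps steps; the SLM continues from them."""
    # Truncate defensively in case the LLM returns more steps than requested.
    steps = llm_steps(question, max_llm_steps)[:max_llm_steps]
    return slm_continue(build_reuse_prompt(question, steps))
```

Because only text passes between the two callables, either model can be swapped without retraining. The two free choices in this sketch, how many steps the LLM emits and how the prompt presents them, are simple instances of the design space the abstract mentions.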
Submission Number: 67