R-Capsule: Compressing High-Level Plans for Efficient Large Language Model Reasoning

09 Sept 2025 (modified: 09 Jan 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Large Language Model, latent reasoning
Abstract: Chain-of-Thought (CoT) prompting has enabled Large Language Models (LLMs) to tackle complex reasoning tasks by generating explicit step-by-step rationales. However, this verbosity incurs significant computational overhead in terms of latency and memory, and can lead to error propagation over long reasoning chains. We propose the \textbf{Reasoning Capsule}, a novel framework that captures the efficiency of latent reasoning while retaining the transparency of explicit CoT. Our core idea is to compress the high-level strategic plan of a reasoning process into a compact, low-dimensional latent representation---the Reasoning Capsule---while leaving the low-level execution steps explicit. This hybrid approach is grounded in the Information Bottleneck principle, where we learn a capsule that is a \emph{minimal sufficient statistic} for the reasoning task. Minimality is enforced structurally via a low-dimensional bottleneck, ensuring efficiency. Sufficiency is enforced via a dual-objective function: a primary task loss for answer accuracy and an auxiliary reconstruction loss that ensures the capsule faithfully represents the original textual plan. This reconstruction objective grounds the latent space, making the compressed plan interpretable and robust against uninformative shortcuts. Our framework unifies efficiency, accuracy, and interpretability, significantly reducing the token footprint of reasoning while maintaining or improving performance on complex reasoning benchmarks.
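The dual-objective training described in the abstract can be illustrated with a minimal sketch. All dimensions, weights, and function names below are hypothetical choices for illustration, not the paper's actual architecture: a linear encoder compresses a plan embedding into a low-dimensional capsule (structural minimality), a classifier head supplies the task loss (sufficiency), and a linear decoder supplies the reconstruction loss that grounds the capsule in the original plan.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions (not from the paper): 64-d plan embedding,
# 8-d capsule bottleneck, 4 answer classes.
D_PLAN, D_CAP, N_CLASSES = 64, 8, 4

# Randomly initialized linear encoder, decoder, and classifier head.
W_enc = rng.normal(0, 0.1, (D_CAP, D_PLAN))
W_dec = rng.normal(0, 0.1, (D_PLAN, D_CAP))
W_cls = rng.normal(0, 0.1, (N_CLASSES, D_CAP))

def softmax(z):
    z = z - z.max()          # stabilize before exponentiation
    e = np.exp(z)
    return e / e.sum()

def capsule_losses(plan_emb, answer_label, lam=0.5):
    """Dual objective: task loss (sufficiency) plus a reconstruction
    loss that keeps the capsule faithful to the textual plan.
    Minimality is structural: D_CAP << D_PLAN."""
    capsule = W_enc @ plan_emb               # compress the high-level plan
    logits = W_cls @ capsule                 # predict the final answer
    task_loss = -np.log(softmax(logits)[answer_label] + 1e-12)
    recon = W_dec @ capsule                  # reconstruct the plan embedding
    recon_loss = np.mean((recon - plan_emb) ** 2)
    return task_loss + lam * recon_loss, task_loss, recon_loss

plan = rng.normal(size=D_PLAN)
total, task, recon = capsule_losses(plan, answer_label=2)
```

In practice both components would be backpropagated jointly; the weighting `lam` between sufficiency and grounding is an assumed hyperparameter.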
Primary Area: foundation or frontier models, including LLMs
Submission Number: 3349