Keywords: LLM, Efficient Reasoning
Abstract: Standard Large Language Models (LLMs) continuously accumulate tokens in their reasoning chains but lack a mechanism to release information that is no longer necessary for the final answer. This accumulation can populate the context window with redundant content, such as dead-end paths or transient verification steps, which can distract the attention mechanism and impede the coherence of long-form reasoning.
In this paper, we introduce Free()LM, an architecture that integrates a \texttt{free()} function to actively manage reasoning context. We augment the base model with a lightweight, trainable Free-Module. During generation, this module is activated at regular intervals to output structured commands that identify and remove redundant segments of the reasoning trace. By dynamically pruning the context, Free()LM maintains a managed workspace throughout the inference process.
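The pruning loop described above can be sketched in miniature. This is an illustrative assumption of the mechanism, not the paper's implementation: the names `free_module`, `FREE_INTERVAL`, and the dead-end heuristic are hypothetical stand-ins for the trainable Free-Module and its structured removal commands.

```python
# Hypothetical sketch of periodic context pruning in the spirit of Free()LM.
# All names here (free_module, FREE_INTERVAL) are illustrative assumptions.

FREE_INTERVAL = 4  # invoke the pruning step every 4 generated segments


def free_module(segments):
    """Stand-in for the trainable Free-Module: return indices of segments
    judged redundant. Here, a toy heuristic flags dead-end segments."""
    return [i for i, s in enumerate(segments) if "dead-end" in s]


def generate_with_free(steps):
    """Accumulate reasoning segments, pruning the context at regular intervals."""
    context = []
    for t, segment in enumerate(steps, start=1):
        context.append(segment)
        if t % FREE_INTERVAL == 0:
            drop = set(free_module(context))
            context = [s for i, s in enumerate(context) if i not in drop]
    return context


trace = ["explore A", "dead-end A", "explore B", "verify B",
         "dead-end C", "explore D", "combine B+D", "answer"]
print(generate_with_free(trace))
```

In this toy run, the two dead-end segments are released at the interval boundaries, so the final context holds only the segments that contribute to the answer; in the real system the removal decision is learned rather than keyword-based.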
Empirical results demonstrate that the Free-Module significantly enhances reasoning performance. Across six long-reasoning benchmarks, Free()LM improves Qwen3-8B and Qwen3-30B-A3B by an average of 4.4%. On Qwen3-235B-A22B, it yields an 11% relative gain on the Humanity's Last Exam (HLE) benchmark. Notably, on complex instances requiring over 70k thinking tokens, Free()LM raises the accuracy of Qwen3-235B-A22B from 0% to 28%. These performance gains are accompanied by improved efficiency: the approach reduces KV cache memory usage on HLE from 6.14GB to 3.34GB per sample. Our findings suggest that effective long-form reasoning depends not only on information retention but also on the strategic removal of redundant context.
Paper Type: Long
Research Area: Language Models
Research Area Keywords: chain-of-thought, robustness
Languages Studied: English
Submission Number: 874