Scaling Long-Horizon Agent via Context Folding

ICLR 2026 Conference Submission 21870 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: LLM Agent, Context Engineering
TL;DR: We propose Context Folding, where agents fold completed subtasks into brief summaries to save context on long tasks.
Abstract: Large language model (LLM) agents are fundamentally constrained by context length on long-horizon tasks. Existing agent frameworks usually rely on manually defined context engineering pipelines, such as multi-agent setups or post-hoc summarization. We introduce Context Folding, a framework that empowers agents to actively manage their working context. An agent can procedurally branch into a sub-trajectory to handle a subtask and then fold it upon completion, collapsing the intermediate steps while retaining a concise summary of the outcome. To make this behavior learnable, we propose FoldPO, an end-to-end reinforcement learning framework with dedicated process rewards that encourage effective task decomposition and context management. On complex long-horizon tasks, our agent matches the performance of baselines while using an active context up to 10$\times$ smaller, and significantly outperforms models constrained to the same context size.
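The branch-and-fold loop described above can be pictured as follows. This is a minimal sketch, not the paper's implementation: the class and method names (`ContextFoldingAgent`, `branch`, `step`, `fold`, `summarize`) and the string-based message format are illustrative assumptions.

```python
# Minimal sketch of the branch/fold context-management loop from the abstract.
# Assumption: the working context is a simple list of text entries; in the
# actual system these would be LLM messages and the summary would be
# generated by the model itself.
from dataclasses import dataclass, field


@dataclass
class ContextFoldingAgent:
    main_context: list[str] = field(default_factory=list)   # active working context
    _branch_context: list[str] | None = None                # steps of the current subtask

    def branch(self, subtask: str) -> None:
        """Open a sub-trajectory; its intermediate steps stay out of the main context."""
        self.main_context.append(f"[branch] {subtask}")
        self._branch_context = []

    def step(self, action: str, observation: str) -> None:
        """Record one action/observation pair in whichever context is active."""
        target = self._branch_context if self._branch_context is not None else self.main_context
        target.append(f"{action} -> {observation}")

    def fold(self) -> None:
        """Close the sub-trajectory: drop its steps, keep only a concise summary."""
        summary = self.summarize(self._branch_context or [])
        self.main_context.append(f"[fold] {summary}")
        self._branch_context = None

    def summarize(self, steps: list[str]) -> str:
        # Placeholder summary; the paper's agent would produce this with the LLM.
        return f"completed subtask in {len(steps)} steps"


if __name__ == "__main__":
    agent = ContextFoldingAgent()
    agent.branch("search repository for config loader")
    agent.step("grep 'load_config'", "3 matches in src/")
    agent.step("open src/config.py", "found loader definition")
    agent.fold()
    # Only the branch marker and the folded summary remain in the active context.
    print(agent.main_context)
```

The point of the sketch is the asymmetry it illustrates: every tool call inside a branch is recorded, but after `fold()` only a single summary line survives in the main context, which is how the active context stays small on long tasks.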
Primary Area: foundation or frontier models, including LLMs
Submission Number: 21870