Lost in the Maze: Overcoming Context Limitations in Long-Horizon Information-Seeking

ICLR 2026 Conference Submission20992 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: deep research, agents, long-context
Abstract: Long-horizon information-seeking tasks require iteratively searching the web over long trajectories and synthesizing information across many sources; this capability is key to powerful applications such as deep research systems. In this work, we show that popular information-seeking frameworks struggle to scale to long trajectories, primarily due to context mismanagement: they accumulate long, noisy content, hit context window and tool budgets, or stop early. We introduce SLIM (Simple Lightweight Information Management), a simple framework that separates retrieval into distinct search and browse tools and periodically summarizes the trajectory, keeping context concise while enabling longer, more focused searches. On long-horizon tasks, SLIM achieves comparable accuracy at substantially lower cost and with far fewer tool calls than strong open-source baselines across multiple base models. Specifically, with o3 as the base model, SLIM achieves 55% on BrowseComp and 31% on HLE, outperforming all open-source frameworks by 7 and 3 absolute points, respectively, while incurring 5x fewer tool calls. Finally, we release an automated fine-grained trajectory analysis pipeline and error taxonomy for characterizing long-horizon information-seeking frameworks; SLIM exhibits fewer hallucinations and fewer unfocused searches than prior systems. We hope our analysis framework and simple tool design inform future long-horizon agents.
Supplementary Material: zip
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 20992