Putting It All into Context: Simplifying Agents with LCLMs

20 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Large language model; Agent; Coding; Long-context Language Model
TL;DR: This work proposes a scaffolding-free approach to LM agents by using long-context LMs to fully observe environments and generate actions directly, achieving strong results on SWE-Bench-Verified.
Abstract: Recent advances in language model (LM) agents have demonstrated significant potential for automating complex real-world tasks. To make progress on these difficult tasks, LM agent architectures have become increasingly complex, often incorporating multi-step retrieval tools, multiple agents, and scaffolding adapted to the underlying LM. In this work, we investigate whether all of this complexity is necessary, or whether parts of these scaffolds can be removed on challenging tasks like SWE-bench. We show that in the case of SWE-bench, simply putting the entire environment into the context of a long-context language model (LCLM) and properly prompting the model makes it competitive with carefully tuned, complex agent scaffolds. A Gemini-1.5-Pro model without any scaffolding or tools achieves 38\% on SWE-Bench-Verified, comparable to approaches using carefully tuned agent scaffolds (32\%). While the unscaffolded approach with Gemini-1.5-Pro falls short of the strongest agentic architectures, we demonstrate that the more capable Gemini-2.5-Pro using the same unscaffolded approach directly attains a 50.8\% solve rate. Additionally, a two-stage approach combining Gemini-1.5-Pro with Claude-3.7 achieves a competitive 48.6\% solve rate. These results suggest that LCLMs can enable a more monolithic design, reducing reliance on exploration scaffolds in fully observable regimes.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 22274