Meta-Harness: Post-Training Reliable Agent Systems via Harness Search
Keywords: Agent systems, post-training, harness engineering, execution traces, long-horizon evaluation
Abstract: The trustworthiness of open-ended AI agents depends not only on model weights but also on the surrounding harness: the code that determines what information to store, retrieve, and present to the model. We introduce Meta-Harness, a system that searches over this agent-systems layer using execution traces from prior candidates rather than compressed summaries or scalar scores. A coding-agent proposer reads source code, scores, and traces from a filesystem, enabling selective diagnosis of long-horizon failures in realistic agent settings. On TerminalBench-2, a realistic benchmark for computer-use agents, the discovered harness surpasses Terminus-KIRA on Claude Opus 4.6 and ranks #1 among reported Claude Haiku 4.5 agents. On online text classification, Meta-Harness improves over a state-of-the-art context-management system by 7.7 points while using 4x fewer context tokens. On retrieval-augmented math reasoning, a single discovered harness improves accuracy on 200 IMO-level problems by 4.7 points on average across five held-out models. These results suggest that building more reliable agents in the wild may require post-training the surrounding agent system, not only the base model.
Track: Regular Paper (9 pages)
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 63