Keywords: Coding Agents, Large Language Models, Agent Evaluation, Interactive Environment
TL;DR: We build USACOArena, a competitive programming arena to evaluate coding agents' decision-making skills under resource constraints, revealing strategic profiles that go beyond simple code correctness.
Abstract: Contemporary coding-agent benchmarks applaud the “first correct answer”, silently assuming infinite tokens, container minutes, and developer patience. In production, every LLM call, test re-run, and rollback incurs a hard cost; agents that cannot budget these resources are dead on arrival. We close this gap with USACOArena, an ICPC-inspired arena where agents pay deterministic credits for every prompt, compilation, test, or rollback. Each task becomes a cost–benefit negotiation under uncertainty: is a second sample worth 15% of the remaining budget, or should the agent pivot to a cheaper heuristic? Real-time credit deduction exposes decision profiles hidden from static leaderboards: the tax of over-specialized generators, the ROI of early-exit heuristics, and the compound interest of lightweight scaffolding. Even identically seeded agents diverge in self-play, revealing a rich policy space in which the same model oscillates between spendthrift submission sprees and parsimonious exploration. Released as a reproducible benchmark and zero-shot curriculum, USACOArena provides the traces, the credit engine, and decision logs from six state-of-the-art agents to catalyze research on coding agents that know when to stop.
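For concreteness, the sketch below illustrates the kind of deterministic, per-action credit accounting the abstract describes. The class names (CreditEngine, BudgetExhausted), action labels, and cost values are hypothetical stand-ins, not the released credit engine's interface.

```python
from dataclasses import dataclass, field

# Minimal sketch of deterministic credit accounting as described in the abstract.
# All names and cost values here are illustrative assumptions, not USACOArena's actual API.

COSTS = {"prompt": 10, "compile": 2, "test": 5, "rollback": 3}  # assumed per-action credit costs

class BudgetExhausted(Exception):
    """Raised when an agent attempts an action it can no longer afford."""

@dataclass
class CreditEngine:
    budget: int                               # remaining credits
    log: list = field(default_factory=list)   # decision trace for later analysis

    def charge(self, action: str) -> int:
        """Deduct the deterministic cost of `action` and return the remaining budget."""
        cost = COSTS[action]
        if cost > self.budget:
            raise BudgetExhausted(f"{action} costs {cost} credits, only {self.budget} left")
        self.budget -= cost
        self.log.append((action, cost, self.budget))
        return self.budget

# Example: an agent weighing a second sample against the remaining budget.
engine = CreditEngine(budget=100)
engine.charge("prompt")    # first LLM call
engine.charge("compile")
engine.charge("test")
if COSTS["prompt"] <= 0.15 * engine.budget:   # the "15% of the remaining budget" question
    engine.charge("prompt")                   # a second sample is affordable; take it
```

Under an accounting like this, the "second sample vs. cheaper heuristic" decision reduces to comparing a known action cost against the live remaining budget, which is the trade-off the arena forces agents to make.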
Submission Type: Research Paper (4-9 Pages)
Submission Number: 34