We Let Agents Compete and They Tried to Cheat. KernelGuard: Defending GPU Competitions from Adversarial Agentic Systems
Keywords: gpu, kernel, verification, reward hack
TL;DR: We introduce KernelGuard, a system for guarding against reward hacks in GPU kernel optimization benchmarks
Abstract: KernelBot showed that open, interactive GPU-kernel competitions can produce useful accelerator code, but it also exposed an unsolved problem: guarding such competitions against reward-hacked kernels. In this setting, where evaluation code is public, we find several exploits, ranging from timer manipulation to physically impossible scores, that bypass the verifier. To combat these hacks, we present KernelGuard, an integrity layer for agentic kernel benchmarks that is deployed in a live evaluation setting involving real participants and coding agents. KernelGuard combines expert-seeded static rules, conservative physics-floor checks, and a tool-using adversarial large language model (LLM) judge whose high-confidence findings are promoted into cheap, auditable rules. On live traffic across three AMD kernel optimization competitions from KernelBot, spanning a 32-day, 182,798-submission window, the hacked-submission rate fell from $3.45\%$ ($4{,}889/141{,}800$) before KernelGuard was integrated to $0.37\%$ ($152/40{,}998$) afterward. KernelGuard treats production benchmark integrity as a live problem that requires layered defenses whose expensive agentic components are continuously distilled into deterministic checks.
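To make the idea of a conservative physics-floor check concrete, the following is a minimal sketch, not KernelGuard's actual implementation. It assumes a roofline-style lower bound on runtime: a kernel cannot finish faster than its arithmetic at peak throughput or its memory traffic at peak bandwidth allows. The names `DeviceLimits`, `physics_floor_seconds`, `is_physically_impossible`, and the `slack` parameter are hypothetical, and the device numbers are illustrative rather than the specification of any real AMD accelerator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceLimits:
    """Hypothetical peak hardware limits (illustrative, not a real device spec)."""
    peak_flops: float      # floating-point operations per second
    peak_bandwidth: float  # bytes per second of memory traffic


def physics_floor_seconds(flops: float, bytes_moved: float, limits: DeviceLimits) -> float:
    """Roofline-style lower bound: the slower of compute time and memory time."""
    compute_time = flops / limits.peak_flops
    memory_time = bytes_moved / limits.peak_bandwidth
    return max(compute_time, memory_time)


def is_physically_impossible(
    measured_seconds: float,
    flops: float,
    bytes_moved: float,
    limits: DeviceLimits,
    slack: float = 0.9,
) -> bool:
    """Flag a timing only if it beats the floor by a clear margin.

    slack < 1 keeps the check conservative (an assumption of this sketch):
    a run slightly under the floor due to measurement noise is not flagged.
    """
    floor = physics_floor_seconds(flops, bytes_moved, limits)
    return measured_seconds < slack * floor


# Example: a 4096 x 4096 x 4096 half-precision matrix multiply.
n = 4096
gemm_flops = 2.0 * n**3          # one multiply and one add per inner-product term
gemm_bytes = 3.0 * n * n * 2     # read A and B, write C, 2 bytes per element
limits = DeviceLimits(peak_flops=1e15, peak_bandwidth=5e12)

floor = physics_floor_seconds(gemm_flops, gemm_bytes, limits)
suspicious = is_physically_impossible(1e-5, gemm_flops, gemm_bytes, limits)
plausible = is_physically_impossible(2e-4, gemm_flops, gemm_bytes, limits)
```

In this sketch the reported 10-microsecond run falls far below the compute-bound floor and is flagged, while the 200-microsecond run is not. A check of this kind is cheap and deterministic, which is why it can sit in front of a more expensive judge.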
Track: Regular Paper (9 pages)
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 239