Keywords: Science, AI
TL;DR: An AI first-author agent that generates, compiles, and profiles Triton/CUDA kernels from high-level prompts or code, choosing the fastest correct design with full provenance.
Abstract: We introduce gburdell3-agent, the first fully autonomous AI author for GPU kernels.
Given only a high-level spec, the agent cycles through hypothesis→code→compile→benchmark→verify→write, delivering production-ready kernels and a camera-ready paper without human intervention or fabricated data.
All 250 experiments (row-softmax, 2-D stencils, particle filters, KernelBench) run on a single NVIDIA H100-SXM5-80 GB inside a locked-down sandbox (>10^5 trials/day, zero network, immutable logs).
Eight falsifiable hypotheses are tested: six are confirmed and two are rejected, with every number linked to an auditable log line.
The agent beats PyTorch eager by up to 1.91× and matches or exceeds vendor libraries.
Kernel optimisation is thus shown to be an ideal microcosm for cheap-oracle, AI-led science.
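As a rough illustration of the benchmark→verify step and of "choosing the fastest correct design", the sketch below filters candidates against a reference, times only the ones that pass, and records one log entry per candidate. It is not the agent's implementation: plain Python functions stand in for Triton/CUDA kernels, wall-clock timing stands in for GPU profiling, and every name (`softmax_reference`, `select_fastest_correct`, the two candidates, the tolerance) is a hypothetical chosen for this example.

```python
import math
import time


def softmax_reference(row):
    """Numerically stable row-softmax used as the correctness oracle."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_two_pass(row):
    """Hypothetical candidate: correct for moderate inputs, no max-shift."""
    exps = [math.exp(x) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_unnormalised(row):
    """Hypothetical candidate with a bug: skips the normalisation."""
    return [math.exp(x) for x in row]


def is_correct(candidate, reference, inputs, tol=1e-9):
    """Compare a candidate against the reference on every input row."""
    for row in inputs:
        got, want = candidate(row), reference(row)
        if len(got) != len(want):
            return False
        if any(abs(g - w) > tol for g, w in zip(got, want)):
            return False
    return True


def median_runtime(fn, inputs, repeats=5):
    """Median wall-clock time over several repeats (a stand-in for GPU timing)."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        for row in inputs:
            fn(row)
        samples.append(time.perf_counter() - start)
    samples.sort()
    return samples[len(samples) // 2]


def select_fastest_correct(candidates, reference, inputs):
    """Return (winner_name, log); incorrect candidates are logged but never timed."""
    log = []
    best = None
    for name, fn in candidates.items():
        ok = is_correct(fn, reference, inputs)
        runtime = median_runtime(fn, inputs) if ok else None
        log.append({"name": name, "correct": ok, "seconds": runtime})
        if ok and (best is None or runtime < best[1]):
            best = (name, runtime)
    return (best[0] if best else None), log


inputs = [[0.0, 1.0, 2.0], [-1.0, 0.5, 3.0]]
candidates = {"unnormalised": softmax_unnormalised, "two_pass": softmax_two_pass}
winner, log = select_fastest_correct(candidates, softmax_reference, inputs)
```

Here `winner` is the only candidate that passes verification, and `log` keeps an entry for the rejected one as well, which mirrors the idea that every reported number should trace back to a recorded measurement.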
Submission Number: 191