SENPAI: Self-ExperimentatioN for Physical AI An Observability-Based Research Harness

Morgan McGuire; Thomas Capelle; Justin Hodges

Published: 30 May 2026, Last Modified: 30 May 2026

Venue: ICML2026-AI4Science Poster

License: CC BY 4.0

Track: Track 1: Original Research/Position/Education/Attention Track

Keywords: Physical AI, CFD, Transolver, Computational Fluid Dynamics, TandemFoilSet, AirfRANS, DrivAerML, autoresearch, SENPAI, aerodynamics, ML surrogate, CFD surrogate, physics-aware

TL;DR: An agent-powered training harness using observability-first memory and orchestration generates strong CFD results across a range of aerodynamics-focussed benchmarks.

Abstract: SENPAI (Self-Experimentation for Physical AI) is an observability-first research harness in which multi-agent state is grounded in pull requests (PRs) and structured experiment logs rather than in agent memory and scratchpads. Although SENPAI is task-agnostic, we evaluate it on CFD-surrogate recipe search, where performance depends on architecture, optimization, losses, normalization, and physics-aware considerations. SENPAI uses a thin Advisor/Student loop to turn hypotheses into PRs, training runs, and PR comments, producing an experiment ledger that both agents and researchers can query. This makes experiments auditable, recoverable, and researcher-steerable. We evaluate SENPAI on DrivAerML, AirfRANS, and TandemFoilSet. Starting from the original Transolver model, SENPAI improves DrivAerML surface-pressure and wall-shear-stress relative-L2 error, reduces AirfRANS surface MSE below all compared references, and reaches a lower TandemFoilSet normalized full-field MSE on the cruise-random-uniform split than the reported benchmark. Overall, SENPAI supports sparsely steered, multi-day, semi-autonomous research that can deliver strong task-specific models. We will release the harness and the full experiment ledger.
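For readers unfamiliar with the relative-L2 error mentioned for DrivAerML, the following is a minimal sketch of the commonly used definition, the L2 norm of the prediction error divided by the L2 norm of the reference field. The abstract does not state the exact formula or any per-sample averaging used in the paper, so the function name `relative_l2_error` and the choice of a single flat sequence of values are assumptions made for illustration only.

```python
import math


def relative_l2_error(pred, target):
    """Return ||pred - target||_2 / ||target||_2 for two equal-length sequences.

    Hypothetical helper: this is the common textbook definition, not
    necessarily the exact metric implementation used by SENPAI.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    # L2 norm of the pointwise error
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)))
    # L2 norm of the reference field
    den = math.sqrt(sum(t ** 2 for t in target))
    if den == 0.0:
        raise ValueError("target has zero norm; relative error is undefined")
    return num / den
```

Under this definition a perfect prediction scores 0, and predicting all zeros scores 1, which makes values comparable across fields with different physical scales such as surface pressure and wall shear stress.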

Submission Number: 181
