Track: Track 1: Original Research/Position/Education/Attention Track
TL;DR: An AI agent supervises experimental design in regulatory genomics by iteratively improving MPRA libraries to produce more informative data for downstream training of predictive models.
Abstract: Designing maximally informative experiments remains a central challenge in genomics. In regulatory genomics, massively parallel reporter assays (MPRAs) enable large-scale functional measurements, but the data they yield are only as informative as the sequence library is well constructed. We introduce an agent-based framework for MPRA library design in which a large language model autonomously constructs and iteratively refines MPRA libraries intended for training predictive models of regulatory activity. Candidate libraries are evaluated using a high-fidelity surrogate of MPRA measurements, enabling rapid closed-loop optimization under realistic experimental constraints in a simulated wet-lab/dry-lab loop. Across independent runs, agent-designed libraries outperform a set of human-designed strategies and yield consistent improvements in predictive performance. In addition, the agent recovers interpretable design principles. These results suggest that AI agents can contribute meaningfully to experimental design in regulatory genomics, and the framework provides a reusable in silico benchmark for evaluating AI co-scientist workflows.
Keywords: AI agents, large language models, experimental design, regulatory genomics, massively parallel reporter assays, predictive modeling, autonomous science
Submission Number: 145