Keywords: AI for Science, Agentic AI, Code Optimization, AutoML
TL;DR: We present TusoAI, an agentic approach to constructing scientific methods, either from scratch or by improving upon a state-of-the-art tool.
Abstract: Scientific discovery is often slowed by the manual development of computational
tools needed to analyze complex experimental data. Building such tools is costly
and time-consuming because scientists must iteratively review literature, test modeling
and scientific assumptions against empirical data, and translate these insights
into efficient software. Large language models (LLMs) have demonstrated
strong capabilities in synthesizing literature, reasoning with empirical data, and
generating domain-specific code, offering new opportunities to accelerate computational
method development. Existing LLM-based systems focus either on
performing scientific analyses using existing computational methods or on developing
computational methods or models for general machine learning without
effectively integrating the often unstructured knowledge specific to scientific domains.
Here, we introduce TusoAI, an agentic AI system that takes a scientific task
description with an evaluation function and autonomously develops and optimizes
computational methods for the application. TusoAI integrates domain knowledge
into a knowledge tree representation and performs iterative, domain-specific
optimization and model diagnosis, improving performance over a pool of candidate
solutions. We conducted comprehensive benchmark evaluations demonstrating
that TusoAI outperforms state-of-the-art expert methods, MLE agents, and scientific
AI agents across diverse tasks. Applying TusoAI to two key open problems
in genetics improved existing computational methods and uncovered new biology
missed by previous methods.
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 20069