Keywords: Large language models, ODE, Solvers
TL;DR: We propose using large language models as "SciML Agents" that generate scientifically appropriate code to solve ODEs.
Abstract: A large body of recent work in scientific machine learning (SciML) aims to tackle scientific tasks directly by predicting target values with neural networks (e.g., physics-informed neural networks, neural ODEs, neural operators, etc.), but attaining high accuracy and robustness has been challenging.
We explore an alternative view: use large language models (LLMs) to write code that leverages decades of numerical algorithms.
To evaluate this approach, we construct two evaluation suites: a diagnostic set of problems whose superficial appearance suggests stiffness but which require algebraic simplification to reveal their non-stiffness, and a large-scale benchmark spanning stiff and non-stiff ODE regimes.
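As a hypothetical illustration of this diagnostic style (not an example taken from the paper's benchmark), consider a right-hand side whose large coefficients cancel algebraically, leaving a plainly non-stiff equation:

```python
# Hypothetical sketch: an ODE right-hand side that looks stiff because of
# large (1e6-scale) coefficients, but simplifies algebraically to y' = y.
import sympy as sp

t = sp.symbols("t")
y = sp.Function("y")(t)

# Superficially stiff-looking RHS; the 1e6-scale terms cancel exactly.
rhs = 10**6 * (y - sp.sin(t)) - 10**6 * y + 10**6 * sp.sin(t) + y

print(sp.simplify(rhs))  # the simplified RHS is just y(t), i.e. y' = y
```

A solver chosen from the unsimplified form might needlessly be an implicit stiff method, whereas the simplified problem is handled by any explicit integrator.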
We evaluate open- and closed-source LLMs along two axes: (i) unguided versus guided prompting with domain-specific knowledge; and (ii) off-the-shelf versus fine-tuned variants.
Our evaluation measures both executability and numerical validity against reference solutions.
We find that with sufficient context and guided prompts, newer instruction-following models achieve high accuracy on both criteria. In many cases, recent open-source systems (e.g., the Qwen3 family) perform strongly without fine-tuning, while older or smaller models still benefit from fine-tuning. Overall, our preliminary results indicate that careful prompting and fine-tuning can yield a specialized LLM agent capable of reliably solving simple ODE problems.
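A minimal sketch of the kind of code such an agent is expected to emit (an assumed example, not the paper's actual output): for a stiff test equation, a scientifically appropriate choice is an implicit integrator such as SciPy's Radau method, and numerical validity can be checked against the analytic solution:

```python
# Hypothetical agent output: solve the stiff linear ODE y' = -1000*y with an
# implicit (stiff-capable) integrator, then check against the exact solution.
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y):
    # Stiff linear test problem; exact solution is y(t) = exp(-1000*t).
    return -1000.0 * y

sol = solve_ivp(rhs, t_span=(0.0, 0.01), y0=[1.0],
                method="Radau", rtol=1e-8, atol=1e-10)

exact = np.exp(-1000.0 * sol.t[-1])
error = abs(sol.y[0, -1] - exact)
print(error)  # small discrepancy vs. the analytic reference solution
```

An explicit method like RK45 would also succeed here but at far more steps; choosing the implicit solver is the "scientifically appropriate" decision the evaluation rewards.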
Submission Number: 94