Keywords: Large Language Models, In-Context Learning, Partial Differential Equations, Spatiotemporal Dynamics, Zero-Shot Extrapolation, Numerical Analysis, Interpretability, AI4Science
TL;DR: Text-trained LLMs can zero-shot extrapolate spatiotemporal dynamics of PDE solutions, exhibiting systematic in-context learning progression and behaviors analogous to classical numerical solvers.
Abstract: Large language models (LLMs) have demonstrated emergent in-context learning (ICL) capabilities across a range of tasks, including zero-shot time-series forecasting. We show that text-trained foundation models can accurately extrapolate spatiotemporal dynamics from discretized partial differential equation (PDE) solutions without fine-tuning or natural language prompting. Predictive accuracy improves with longer temporal contexts but degrades at finer spatial discretizations. In multi-step rollouts, where the model recursively predicts future spatial states over multiple time steps, errors grow algebraically with the time horizon, reminiscent of global error accumulation in classical finite-difference solvers. We interpret these trends as in-context neural scaling laws, where prediction quality varies predictably with both context length and output length. To better understand how LLMs internally process PDE solutions well enough to roll them out accurately, we analyze token-level output distributions and uncover a consistent ICL progression: it begins with syntactic pattern imitation, transitions through an exploratory high-entropy phase, and culminates in confident, numerically grounded predictions.
Supplementary Material: zip
Primary Area: interpretability and explainable AI
Submission Number: 8150
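
Illustrative note: the abstract states that rollout errors grow algebraically with the time horizon. One common way to check such a claim is to fit a power law, error ≈ C · h^p, by least squares in log-log space. The sketch below is a minimal example of that fit and is not taken from the submission: the function name `fit_power_law` is hypothetical, and the error values are synthetic placeholders generated from a known exponent, not results from the paper.

```python
import numpy as np


def fit_power_law(horizons, errors):
    """Fit errors ~ C * horizons**p by linear least squares in log-log space.

    Returns (C, p). Both inputs must be strictly positive.
    """
    log_h = np.log(np.asarray(horizons, dtype=float))
    log_e = np.log(np.asarray(errors, dtype=float))
    # np.polyfit with degree 1 returns [slope, intercept]
    p, log_c = np.polyfit(log_h, log_e, 1)
    return float(np.exp(log_c)), float(p)


# Synthetic placeholder data (NOT results from the paper):
# errors generated from a known power law with exponent 1.5.
horizons = np.arange(1, 21)
synthetic_errors = 1e-3 * horizons.astype(float) ** 1.5

C, p = fit_power_law(horizons, synthetic_errors)
print(f"fitted prefactor C = {C:.3e}, fitted exponent p = {p:.3f}")
```

A straight line on a log-log plot of error against horizon suggests algebraic growth with slope p, whereas exponential growth would instead appear linear on a semi-log plot.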