Keywords: scientific foundation model, in-context learning, zero-shot, noisy prior data
Abstract: Recent advances in scientific machine learning have begun to explore the potential of scientific foundation models (SFMs). Inspired by the in-context learning (ICL) framework of large language models (LLMs), we leverage prior data and pre-training techniques to construct our SFM. It has been demonstrated that ICL in LLMs can perform Bayesian inference, resulting in strong generalization capabilities. Furthermore, LLMs do not exhibit an intrinsic inductive bias; rather, they inherit bias from the prior data, as confirmed experimentally.
Building upon these insights, our methodology is structured as follows: (i) we collect prior data in the form of solutions of partial differential equations (PDEs) constructed as arbitrary linear combinations of functions from a mathematical dictionary, (ii) we utilize Transformer architectures with self-attention and cross-attention mechanisms to predict PDE solutions without knowledge of the governing equations in a zero-shot setting, and (iii) we provide experimental evidence on the one-dimensional convection-diffusion-reaction equation, demonstrating that pre-training remains robust even with noisy prior data, with only a marginal impact on test accuracy. Notably, this finding opens the path to pre-training SFMs with realistic, low-cost data instead of, or in conjunction with, high-cost numerical data. These results support the conjecture that SFMs can improve in a manner similar to LLMs, for which it is nearly impossible to fully clean the vast set of sentences crawled from the Internet.
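As a minimal illustration of step (i), the sketch below builds noisy prior data as random linear combinations of dictionary functions. It is not the authors' code: the Fourier-style dictionary, the Gaussian coefficient distribution, and the noise level are illustrative assumptions.

```python
# Minimal sketch of constructing noisy prior data for pre-training (step (i)).
# Dictionary choice, coefficient distribution, and noise level are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def dictionary(x, n_terms=8):
    """Evaluate a simple Fourier-style dictionary on the grid x (assumed choice)."""
    return np.stack(
        [np.sin((k + 1) * np.pi * x) for k in range(n_terms)]
        + [np.cos((k + 1) * np.pi * x) for k in range(n_terms)],
        axis=0,
    )

def sample_prior_solution(x, noise_std=0.05):
    """Arbitrary linear combination of dictionary functions, optionally corrupted
    with Gaussian noise to mimic low-cost, imperfect prior data."""
    D = dictionary(x)                       # shape (2 * n_terms, len(x))
    coeffs = rng.normal(size=D.shape[0])    # random combination weights
    u = coeffs @ D                          # clean "solution" profile
    return u + noise_std * rng.normal(size=u.shape)

x = np.linspace(0.0, 1.0, 256)
prior_dataset = np.stack([sample_prior_solution(x) for _ in range(1024)])
print(prior_dataset.shape)  # (1024, 256): samples used as prior data for pre-training
```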
Supplementary Material: zip
Submission Number: 42