UPS: Efficiently Building Foundation Models for PDE Solving via Cross-Modal Adaptation

TMLR Paper3164 Authors

09 Aug 2024 (modified: 15 Nov 2024) | Decision pending for TMLR | CC BY 4.0
Abstract: We present Unified PDE Solvers (UPS), a data- and compute-efficient approach to developing unified neural operators for diverse families of spatiotemporal PDEs from various domains, dimensions, and resolutions. UPS embeds different PDEs into a shared representation space and processes them using an FNO-transformer architecture. Rather than training the network from scratch, which is data-demanding and computationally expensive, we warm-start the transformer from pretrained LLMs and perform explicit alignment to reduce the modality gap while improving data and compute efficiency. The cross-modal UPS achieves state-of-the-art results on a wide range of 1D and 2D PDE families from PDEBench, outperforming existing unified models while using 4 times less data and 26 times less compute. It is also capable of few-shot transfer to unseen PDE families and coefficients.
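The abstract names an FNO-transformer architecture without giving layer details. As a rough illustration of the spectral convolution that FNO-style models are generally built on, the following is a minimal NumPy sketch, not the authors' implementation. The function name `spectral_conv_1d`, the `weights` layout, and the `n_modes` parameter are hypothetical choices made for this example, and the weights here are random or fixed rather than learned.

```python
import numpy as np


def spectral_conv_1d(u, weights, n_modes):
    """Apply one 1D spectral convolution (illustrative sketch only).

    u:       real array of shape (channels_in, n_points)
    weights: complex array of shape (channels_in, channels_out, n_modes)
    n_modes: number of low-frequency Fourier modes to keep
    """
    n_points = u.shape[-1]
    # Transform the input to Fourier space along the spatial axis.
    u_hat = np.fft.rfft(u, axis=-1)
    # Start from zeros so that all modes above n_modes are discarded.
    out_hat = np.zeros((weights.shape[1], u_hat.shape[-1]), dtype=complex)
    # Mix channels independently for each retained mode.
    out_hat[:, :n_modes] = np.einsum("im,iom->om", u_hat[:, :n_modes], weights)
    # Return to physical space at the input resolution.
    return np.fft.irfft(out_hat, n=n_points, axis=-1)


# Hypothetical usage: the same weights applied at two grid resolutions.
rng = np.random.default_rng(0)
w = rng.standard_normal((2, 3, 8)) + 1j * rng.standard_normal((2, 3, 8))
coarse = spectral_conv_1d(rng.standard_normal((2, 64)), w, n_modes=8)
fine = spectral_conv_1d(rng.standard_normal((2, 128)), w, n_modes=8)
print(coarse.shape, fine.shape)
```

Because the weights act on a fixed number of Fourier modes rather than on grid points, the same parameters can be applied to inputs of different resolutions, which is one reason spectral layers suit data drawn from varied discretizations.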
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: We have revised the manuscript in response to reviewer feedback. Key updates include:
1. **More Ablation Studies:** We present additional ablation studies in Table 4 S3 to assess the contribution of pretrained LLMs beyond merely labeling PDE types. These studies use one-hot encodings and learnable random embeddings as alternative strategies and demonstrate the benefits of pretrained text embeddings when processing PDE data.
2. **More Generalizability Results:** We added experiments on more challenging datasets, such as the 2D Navier-Stokes equations, to Table 2 and Table 3 of the paper. These experiments showcase the model's ability to generalize to unseen PDEs and provide evidence of its few-shot learning capabilities.
3. **Computational Efficiency Metrics:** We report additional efficiency metrics, including FLOPs and inference times, in Appendix A3.2 to better quantify the efficiency of our method.
4. **Clarifications and Corrections:** We revised various sections of the manuscript for clarity and correctness. Notable changes include correcting notation inconsistencies and refining the description of the model's evaluation methods.
Code: https://github.com/sjunhongshen/UnifiedPDESolvers
Assigned Action Editor: ~Markus_Heinonen1
Submission Number: 3164