Keywords: autoregressive models, video modeling, compute-adaptive inference, tokenization-based artifacts, physics-based PDEs
TL;DR: Overtone enables inference-time compute adaptivity for PDE surrogate models without losing accuracy, and supports novel inference-time cyclic rollout strategies that suppress tokenization artifacts.
Abstract: Transformer-based PDE surrogates achieve remarkable performance but face two key challenges: fixed patch sizes cause systematic error accumulation at harmonic frequencies, and computational costs remain inflexible regardless of problem complexity or available resources. We introduce Overtone, a unified solution through dynamic patch size control at inference. Overtone's key insight is that cyclically modulating patch sizes during autoregressive rollouts distributes errors across the frequency spectrum, mitigating the systematic accumulation of harmonic artifacts that plagues fixed-patch models. We implement this through two architecture-agnostic modules—CSM (Convolutional Stride Modulation, using dynamic stride modulation) and CKM (Convolutional Kernel Modulation, using dynamic kernel resizing)—that together provide both harmonic mitigation and compute-adaptive deployment. This flexible tokenization lets users trade accuracy for speed dynamically based on computational constraints, and the cyclic rollout strategy yields up to 40% lower long-rollout error in variance-normalised RMSE (VRMSE) compared to conventional, static-patch surrogates. Across challenging 2D and 3D PDE benchmarks, a single Overtone model matches or exceeds fixed-patch baselines across inference compute budgets when trained under a fixed total training budget.
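The core idea of cyclic patch-size modulation can be sketched in a few lines. The snippet below is an illustrative toy, not the Overtone implementation: `cyclic_patch_schedule` and `patchify` are hypothetical helpers, and the reshape-based tokenizer stands in for the learned strided-convolution tokenizer that CSM would modulate. It only shows how varying the patch size per rollout step changes the tokenization, which is the mechanism claimed to spread error across the frequency spectrum rather than concentrating it at the harmonics of one fixed patch size.

```python
import numpy as np


def cyclic_patch_schedule(num_steps, sizes=(4, 8, 16)):
    """Return a patch size for each autoregressive rollout step,
    cycling through the given sizes (hypothetical helper)."""
    return [sizes[t % len(sizes)] for t in range(num_steps)]


def patchify(field, patch):
    """Tokenize a 2D field into non-overlapping patches of side `patch`.

    Stand-in for a strided-convolution tokenizer (stride == patch size);
    a real model would apply learned convolution weights here.
    """
    h, w = field.shape
    assert h % patch == 0 and w % patch == 0, "field must tile evenly"
    tokens = field.reshape(h // patch, patch, w // patch, patch)
    # (rows, cols, ph, pw) -> (num_tokens, patch*patch)
    return tokens.transpose(0, 2, 1, 3).reshape(-1, patch * patch)


# A 32x32 field tokenized under a 3-step cyclic schedule:
field = np.arange(32 * 32, dtype=np.float32).reshape(32, 32)
for patch in cyclic_patch_schedule(3, sizes=(4, 8, 16)):
    tokens = patchify(field, patch)
    # token count shrinks as patch size grows (coarser, cheaper step)
    print(patch, tokens.shape)
```

Note how each step trades token count (and hence transformer compute, which is quadratic in sequence length) against spatial resolution: patch size 4 yields 64 tokens, size 8 yields 16, size 16 yields 4. This is the same knob the abstract describes for compute-adaptive deployment.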
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 22417