Abstract: Reconstructing continuous physical fields from sparse, irregular observations is a fundamental challenge
in scientific machine learning, particularly for nonlinear systems governed by partial differential equations
(PDEs). Dominant physics-informed approaches enforce governing equations as soft penalty terms during
optimization, a strategy that often leads to gradient imbalance, instability, and degraded physical consistency
when measurements are scarce. Here we introduce the Physics-Guided Transformer (PGT), a neural
architecture that moves beyond residual regularization by embedding physical structure directly into the
self-attention mechanism. Specifically, PGT incorporates a heat-kernel–derived additive bias into attention
logits, endowing the encoder with an inductive bias consistent with diffusion physics and temporal causality.
Query coordinates attend to these physics-conditioned context tokens, and the resulting features drive a
FiLM-modulated sinusoidal implicit decoder that adaptively controls spectral response based on the inferred
global context. We evaluate PGT on two canonical benchmark systems spanning diffusion-dominated and
convection-dominated regimes: the one-dimensional heat equation and the two-dimensional incompressible
Navier–Stokes equations. In 1D sparse reconstruction with as few as 100 observations, PGT attains a relative
L2 error of 5.9×10⁻³, representing a 38-fold reduction over physics-informed neural networks and more than a
90-fold reduction over sinusoidal implicit representations. In the 2D cylinder-wake problem reconstructed
from 1500 scattered spatiotemporal samples, PGT uniquely achieves strong performance on both axes of
evaluation: a governing-equation residual of 8.3×10⁻⁴, on par with the best residual-based methods,
alongside a competitive overall relative L2 error of 0.034, substantially below all methods that achieve
comparable physical consistency. No individual baseline simultaneously satisfies these dual criteria. Convergence
analysis further reveals sustained, monotonic error reduction in PGT, in contrast to the early optimization
plateaus observed in residual-based approaches. These findings demonstrate that structural incorporation
of physical priors at the representational level, rather than solely as an external loss penalty, substantially
improves both optimization stability and physical coherence under data-scarce conditions. Physics-guided
attention provides a principled and extensible mechanism for reliable reconstruction of nonlinear dynamical
systems governed by partial differential equations.
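As a rough illustration of the two mechanisms named above, the sketch below shows a heat-kernel-derived additive bias on attention logits with a temporal-causality mask, and a FiLM-modulated sine layer. All function names, shapes, and the diffusivity value `kappa` are our own illustrative choices, not the paper's implementation.

```python
import numpy as np

def heat_kernel_bias(x, t, kappa=0.1, eps=1e-6):
    """Additive attention bias b_ij = -(x_i - x_j)^2 / (4*kappa*dt) for dt = t_i - t_j > 0.

    Pairs with t_j >= t_i are masked to -inf (temporal causality); the diagonal is
    kept at 0 so every token can attend to itself. Illustrative sketch only.
    """
    dx2 = (x[:, None] - x[None, :]) ** 2
    dt = t[:, None] - t[None, :]
    bias = np.where(dt > eps, -dx2 / (4.0 * kappa * np.maximum(dt, eps)), -np.inf)
    np.fill_diagonal(bias, 0.0)
    return bias

def attention(Q, K, V, bias):
    """Scaled dot-product attention with an additive physics bias on the logits."""
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d) + bias
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def film_sine_layer(coords, h_ctx, W, Wg, Wb, omega0=30.0):
    """FiLM-modulated sinusoidal layer: context predicts per-feature scale/shift
    that modulate the frequency response of a sine activation."""
    gamma = h_ctx @ Wg          # per-query multiplicative modulation
    beta = h_ctx @ Wb           # per-query additive modulation
    return np.sin(omega0 * gamma * (coords @ W) + beta)
```

Here the bias is added before the softmax, so it reweights attention multiplicatively (exp of the log heat kernel) rather than acting as a loss penalty, which is the structural distinction the abstract draws against residual-based training.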