\section{Introduction}\label{sec:intro}

% KEY:
% \begin{itemize}
%     \item \ada{(violet) for edited}
%     \item \old{(dark green) for the original source}
%     \item \add{(purple) for added content that mostly come from the review process}.
% \end{itemize}

\emph{Stochastic differential equations} (SDEs) are widely used to model the evolution of stochastic processes across various fields, including the sciences, engineering, and finance.
In many of these applications, particularly in \emph{safety-critical} domains, a key concern is understanding how the state uncertainty in SDEs propagates over space and time.
This state uncertainty can be represented by a probability density function (PDF), whose evolution is governed by the Fokker-Planck partial differential equation (FP-PDE).
However, analytical solutions for general FP-PDEs are unavailable. Numerical methods (e.g., finite elements or finite differences \citep{spencer1993numerical,drozdov1996solution,masud2005application,pichler2013numerical,qian2019conservative,urena2020non}) are typically employed instead, but they scale poorly as the dimensionality grows beyond three \citep{tabandeh2022numerical}.
% However, solving the FP-PDE is generally computationally expensive and prone to numerical errors \citep{spencer1993numerical,drozdov1996solution,masud2005application,pichler2013numerical,qian2019conservative,urena2020non,tabandeh2022numerical}, except in simple cases.
Recent advancements in deep learning suggest that physics-informed learning frameworks, called \emph{physics-informed neural networks} (PINNs), can effectively learn PDE solutions, showing notable success in handling high-dimensional systems (up to 200 dimensions) and complex geometries \citep{sirignano2018dgm,lu2021deepxde}.
Despite their effectiveness, PINNs are still subject to approximation errors, 
% \ada{raising concern for applying PINNs-based PDE solvers in safety-critical scenarios.}
a crucial concern in safety-critical systems.
In this work, we tackle this challenge by developing a novel framework that approximates FP-PDE solutions using PINNs and rigorously bounds the approximation error.
% While PINNs can achieve high levels of approximation accuracy for high dimension systems, the absence of knowledge regarding the true PDE solution makes it impossible to assess the quality of the learned solution definitely. This limitation poses a major challenge to apply PINNs-based PDE solvers in real world scenarios.

% \ada{Some foundational studies (e.g. \citet{de2022generic,de2022error,mishra2023estimates,de2024error}) leverage functional analysis to quantify uncertainty of PINNs solution in terms of total error}
Recent works on using PINNs to approximate solutions to PDEs typically analyze approximation errors in terms of \emph{total} error, 
capturing cumulative approximation error across space and time
\citep{de2022generic,de2022error,mishra2023estimates,de2024error}.
While useful in some applications, this approach is less informative for SDEs and their PDF propagation. 
Moreover, total error bounds are often overly loose, sometimes exceeding the actual
errors by several orders of magnitude. Crucially, these bounds do not provide insight into the worst-case approximation error at specific time instances or within particular subsets of space, which
is essential in many stochastic systems. 
For example, in autonomous driving scenarios involving pedestrian crossings, accurately predicting and bounding the probability of collision requires precise reasoning over specific time instances and spatial regions. Loose over-approximations can lead to undesirable behaviors, such as sudden braking.

In this work, we show how PINNs can be used to approximate solutions to the FP-PDE (i.e., the PDF of an SDE's state) and, more importantly, introduce a framework for tightly bounding the worst-case approximation error
over the subset of interest in state space as a function of time.
Our key insight is that the approximation error is related to the residual of the FP-PDE
and is governed by another PDE. 
Hence, a second PINN can be used to learn the error, with its own error also following a PDE. This results in a recursive formulation of error functions, each of which can be approximated using a PINN.  
We establish sufficient training conditions under which this series converges with a finite number of terms. Specifically, we prove that two PINNs are enough to obtain arbitrarily tight error bounds. Additionally, we derive a more practical bound requiring only one error PINN at the cost of losing arbitrary tightness, and provide a method to verify its sufficient condition. 
Furthermore, we propose a training scheme with regularization and discuss extensions to other linear PDEs. 
Finally, we illustrate and validate these error bounds through experiments on several SDEs, supporting our theoretical claims.
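
As a minimal sketch of the key insight above, in notation assumed here purely for illustration (it may differ from the notation adopted in later sections), consider an It\^o SDE $dx = f(x,t)\,dt + g(x,t)\,dw$ with an $n$-dimensional state $x$. Its PDF $p(x,t)$ satisfies the FP-PDE, which can be written with a linear differential operator $\mathcal{D}$ as
\[
\mathcal{D}[p] := \frac{\partial p}{\partial t} + \sum_{i=1}^{n} \frac{\partial}{\partial x_i}\bigl(f_i\, p\bigr) - \frac{1}{2}\sum_{i,j=1}^{n} \frac{\partial^2}{\partial x_i \partial x_j}\bigl((g g^\top)_{ij}\, p\bigr) = 0 .
\]
Let $\hat{p}$ denote a PINN approximation of $p$, and let $e_1 := p - \hat{p}$ denote its (unknown) error. Since $\mathcal{D}$ is linear and $\mathcal{D}[p] = 0$, the error satisfies
\[
\mathcal{D}[e_1] = -\mathcal{D}[\hat{p}], \qquad e_1(x,0) = p(x,0) - \hat{p}(x,0),
\]
where the residual $\mathcal{D}[\hat{p}]$ can be evaluated from $\hat{p}$ alone, without knowledge of $p$. Repeating the argument for a second PINN $\hat{e}_1$ that approximates $e_1$, its error $e_2 := e_1 - \hat{e}_1$ satisfies
\[
\mathcal{D}[e_2] = -\mathcal{D}[\hat{p} + \hat{e}_1],
\]
which illustrates how the recursion arises.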

In short, our main contributions are five-fold:
\begin{itemize}
    \item a method for approximating the PDF of processes modeled by SDEs using PINNs,
    \item a novel approach to tightly bound the approximation error over time and space through a recursive series of error functions learned by PINNs,
    \item a proof that this recursive process converges with only two PINNs needed for arbitrarily tight bounds,
    \item the derivation of a more practical error bound requiring just one error PINN, along with a method to verify its sufficient condition, and
    \item validation of the proposed error bounds through experiments on several SDEs.
\end{itemize}

\paragraph{Related Work}
Research on using PINNs to approximate PDE solutions often focuses on total error, which represents the cumulative error across all time and space.
For instance, \citet{mishra2023estimates} derive an abstract total error bound. Nevertheless, their numerical experiments reveal that this total error bound is loose, exceeding the actual errors by nearly three orders of magnitude. 
% This approach is extended to Navier-stokes in \citep{de2024error}, showing similar results. 
A similar approach is extended to the Navier-Stokes equations \citep{de2024error}, with comparable results.
\citet{de2022error} consider FP-PDEs derived from linear SDEs only. They propose an abstract approach to bound the total error, but no numerical experiments are presented.
\citet{de2022generic} also derive total error bounds for PINNs (and operators), assuming an a priori error estimate.
In contrast, our work emphasizes bounding the worst-case error at any time of interest for general SDEs, which is particularly valuable in practical applications of stochastic systems (e.g., systems subject to chance constraints \citep{oguri2021robust,paiola2024evaluation}).


% is to develop a general framework to tightly bound the worst-case error at any time of interest, as it allows for direct answers to the critical question: what is the maximum error of the approximate solution at any given time?

To demonstrate the approximation capabilities of neural networks, error analysis is key.
% \old{Error analysis is a well-established area focused on demonstrating the approximation capabilities of neural networks.}
For example,
\citet{hornik1991approximation} proves that a standard multi-layer feed-forward neural network can approximate a target function arbitrarily well.
\citet{yarotsky2017error} considers the worst-case error and shows that deep ReLU neural networks are universal approximators of functions in Sobolev spaces.
Recently, deep operator nets (DeepONet) have been suggested to learn PDE operators, with
\citet{lanthaler2022error} proving that for every $\epsilon>0$, there exist DeepONets such that the total error is smaller than $\epsilon$.
While these studies show the capabilities of neural networks, they do not address the critical question: what are the quantified errors for a given neural network approximation? This is the central issue tackled by our work.

Error estimates have also been investigated when neural networks are trained as surrogate models for given target functions.
For instance, \citet{barron1994approximation} derives the error between the learned network and target function in terms of training configurations. To learn a latent function with quantified error, Gaussian process regression \citep{archambeau2007gaussian} is often employed, where observations of an underlying process are required to learn the mean and covariance.
Recently, \citet{yang2022guaranteed} estimate the worst-case error given target functions and neural network properties. 
Nevertheless, a fundamental difference between our work and these studies is that 
% we do not know the true solutions, nor do we assume data of the true underlying processes.
% A key distinction of our work is that 
we do not assume knowledge of the true solutions (latent functions) or rely on data from the underlying processes.

Solving PDEs is an active research area with various established approaches. For the FP-PDE, numerical methods, such as finite element, finite difference, or Galerkin projection methods, have been employed \citep{spencer1993numerical,drozdov1996solution,masud2005application,chakravorty2006homotopic,pichler2013numerical,qian2019conservative,urena2020non}.
For PDF propagation, instead of solving the FP-PDE, some approaches perform a time-discretization of the SDE and use  
% As such, for specific dynamical systems, model-based nonlinear uncertainty propagation methods, such as 
Gaussian mixture models \citep{terejanu2008uncertainty}. 
% or state transition tensor in \citep{fujimoto2012analytical}.
Recent works \citep{khoo2019solving, song2025finite, lin2024deep} employ numerical methods for approximating the transition probability between two regions, which is also governed by the FP-PDE.
While these studies report accurate approximations in a posteriori evaluations,
% they can be computationally demanding and often lack the ability to quantify and bound the error.
they can be computationally demanding and often lack rigorous error quantification and bounding.


% In contrast, our method for approximating solutions to the FP-PDE using PINNs is computationally tractable and centers on constructing error bounds for them.