\section{Problem Statement and Related Work}
Certified robustness refers to the ability of a neural network to consistently classify inputs correctly within a specified range of perturbations. Unlike empirical robustness, which is tested through experiments and simulations, certified robustness provides theoretical assurances, ensuring that the network's predictions remain unchanged for perturbations below a certain magnitude.
Various approaches have been developed to certify the robustness of neural networks.

\emph{Complete Verification} provides formal guarantees of robustness by exhaustively analyzing all possible perturbations within a given range. \citet{reluplex} proposed the first exact verification method for Neural Networks, using tools from Satisfiability Modulo Theories (SMT). Notably, they prove in the same work that this is an NP-complete problem.

\emph{Incomplete Verification Methods} use conservative approximations. They are computationally efficient, but they may fail to certify an input that is in fact robust. Interval bound propagation and abstract interpretation are prominent examples~\cite{NEURIPS2018_f2f44698}.

\emph{Probabilistic Assessment} resorts to random simulations with statistical guarantees on the probability of failure
under a certain noise distribution~\cite{webb2018statistical}.
Some works combine these simulations with formal methods~\cite{proven}.
Our method belongs to this family.


\subsection{Probabilistic Assessment}
Consider a trained neural network classifier $f:[0,1]^d \to [0,1]^C$ mapping an input to a probability vector over $C$ classes, and a clean input $\vecx_0$ that is correctly classified: $\arg \max_{1\leq i \leq C} f_i(\vecx_0) = c$, where $c$ is the ground truth class. The question is whether a random perturbation, modeling uncertainties on the input measurement, can cause a misclassification.

The approach of~\cite{webb2018statistical} is to cast this question as the computation of a probability. Assuming a statistical model of a random additive perturbation $\N$, the objective is to compute the probability of failure (i.e., misclassification). We introduce the random input $\X = \vecx_0 + \N$, whose distribution is denoted $\pi$. The probability of failure is defined as
\begin{equation}
    \label{eq:DefProbFail}
    \pfail(\pi) \defeq \int_{[0,1]^d} \ind{h(\vecx)\geq 0} \pi(d\vecx), 
\end{equation}
where $h:[0,1]^d \to [-1,1]$ measures how close an input is to a misclassification.
For instance,
\begin{equation}
\label{eq:h_function}
    h(\vecx) \defeq \max_{i\in [1:C], i\neq c} f_i(\vecx) - f_c(\vecx).
\end{equation}
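A value $h(\vecx)>0$ indicates that $\vecx$ is not classified as class $c$, the ground truth of $\vecx_0$; the boundary case $h(\vecx)=0$ (a tie) is also counted as a failure in~\eqref{eq:DefProbFail}.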
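
As a purely illustrative example (the numbers are hypothetical and not taken from any experiment), take $C=3$ and $c=1$. An output $f(\vecx) = (0.7, 0.2, 0.1)$ gives
\[
    h(\vecx) = \max(0.2, 0.1) - 0.7 = -0.5 < 0,
\]
so $\vecx$ is still classified as $c$, whereas an output $f(\vecx) = (0.3, 0.6, 0.1)$ gives
\[
    h(\vecx) = \max(0.6, 0.1) - 0.3 = 0.3 > 0,
\]
which is a misclassification.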

%Following the approach of \cite{webb2018statistical}, given a trained neural network $f$, and a with distribution $\pi$, centered on a clean input $\vecx_0$, modeling the input uncertainties, we wish to estimate or bound the local probability of failure $\pfail(\pi)$ defined as:

\subsection{Related Works}
%\subsection{Cross-Entropy Adaptive IS}

Recent machine learning papers dealing with local robustness against uncertainties largely ignore the literature of Statistical Reliability Engineering and refer instead to works in the field of Rare Event Simulation. Their workhorse is the Sequential Monte Carlo (SMC) family of algorithms, also known as Adaptive Multilevel Splitting (AMS)~\citep{beck_mls,amshistory}.
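
To recall the principle in the notation of this section (a sketch only: the intermediate levels $L_k$ and their number $K$ are illustrative notation, not used elsewhere), multilevel splitting writes the failure event as the last of a sequence of nested events defined by thresholds $-\infty = L_0 < L_1 < \dots < L_K = 0$ on the score $h$:
\[
    \pfail(\pi) = \prod_{k=1}^{K} \Pr\left( h(\X) \geq L_k \mid h(\X) \geq L_{k-1} \right).
\]
The levels can be chosen so that each conditional probability is much larger than $\pfail(\pi)$, which makes each factor far easier to estimate by simulation than the rare event itself.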

As far as we know, \citet{webb2018statistical} are the first to use an SMC simulation to estimate the probability of failure of deep NNs. \citet{efficient} use a variant that is faster but only predicts whether the probability of failure is below a critical level. The method has some statistical guarantees and is efficient since the reported critical level can be as low as $10^{-50}$.

\citet{scalable_verification} use Crude Monte Carlo simulation, albeit within a sequential testing scheme~\citep{seq_testing} that increases the computational budget adaptively. It comes with robust non-asymptotic guarantees but in practice only works for high critical levels, typically greater than $10^{-3}$.
All these methods require the statistical model of the uncertainties, as well as access to the function $h$~\eqref{eq:h_function} (if working in the input space) or the function $G$~\eqref{eq:PfailUspace} (if working in the U-space) as a black box.
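
The restriction of Crude Monte Carlo to high critical levels can be seen from a standard variance computation, sketched here with illustrative notation ($n$, $\hat{p}_n$ and $\varepsilon$ are not used elsewhere). With $n$ i.i.d. samples $\X_1,\dots,\X_n$ drawn from $\pi$, the estimator
\[
    \hat{p}_n = \frac{1}{n} \sum_{i=1}^{n} \ind{h(\X_i)\geq 0}
\]
is unbiased, with relative standard deviation
\[
    \frac{\sqrt{\mathrm{Var}(\hat{p}_n)}}{\pfail(\pi)} = \sqrt{\frac{1-\pfail(\pi)}{n\,\pfail(\pi)}}.
\]
Reaching a relative accuracy $\varepsilon$ therefore takes about $n \approx 1/(\varepsilon^2\,\pfail(\pi))$ calls to the network: for $\pfail(\pi) = 10^{-3}$ and $\varepsilon = 0.1$, this is already $n \approx 10^{5}$.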

\citet{titaistats} propose a new SMC-like algorithm tailored for NNs: it exploits the gradient $\nabla G(\vecu)$, which is easy to compute for \emph{white box} NNs thanks to automatic differentiation via backpropagation.

However, all these variants of SMC require many calls to the neural network function. Indeed, the total number of calls is generally on the order of \emph{hundreds of thousands} to make a statement about the probability of failure around a \emph{single} input $\vecx_0$. In contrast, our method, under assumptions we detail, gives reliable estimates in a few thousand calls.
