
\section{Robustness to Adversary}\label{sec:adversary}

In this section, we make the \textsc{Query} algorithm robust: with high probability, the algorithm must respond correctly to all query points simultaneously.
We achieve this goal in three steps, starting from the constant success probability of the \textsc{Query} procedure proved in the previous section.
In the first step, we boost this constant probability to high probability by applying the median technique. After this step, the algorithm succeeds with high probability only for one fixed point, whereas we want it to respond correctly to arbitrary query points.

It is not easy to generalize directly from one fixed point to the infinitely many points of the whole space. We therefore take an intermediate step by introducing the unit ball and $\epsilon$-nets. The unit ball in $\R^d$ is the set of points whose $\ell_2$ norm is at most $1$. An $\epsilon$-net is a finite collection of points, called net points, that has the ``covering'' property: the union of the balls of radius $\epsilon$ centered at the net points covers the unit ball. In the second step, we show that, given a net of the unit ball, the algorithm is correct on all net points.
Finally, in the third step, we extend the correctness of the algorithm from the net points to all query points in the unit ball, which yields a robust algorithm.

 


 

\paragraph{Starting point.} In Section~\ref{sec:correctness}, we have already obtained a query algorithm with constant success probability for a fixed query point.

 
\begin{lemma}[Starting with constant probability]\label{lem:single_estimator}
Given $\epsilon \in (0,0.1)$, a query point $q \in \R^d$, and a set of data points $X = \{ x_{i}\}_{i=1}^{n} \subset \R^d$, let
\begin{align*}
f_{\mathsf{KDE}}^*(q) := \frac{1}{|X|} \sum_{x \in X} f(x,q).
\end{align*}
Then an estimator $\mathcal{D}$ can answer the query with an output satisfying
$(1-\epsilon) \cdot f_{\mathsf{KDE}}^*(q) \leq \mathcal{D}.\textsc{query}(q, \epsilon) \leq (1 + \epsilon)\cdot f_{\mathsf{KDE}}^*(q)$
with probability at least $0.9$.
\end{lemma}

\paragraph{Boost the constant probability to high probability.}
 
Next, we boost the success probability by running the query procedure on several estimators and taking the median of their outputs.

\begin{lemma}[Boost the constant probability to high probability]\label{lem:fixed_points}
Let $\delta_1 \in (0,0.1)$ denote the failure probability. Let $\epsilon \in (0,0.1)$ denote the accuracy parameter.
Given $L = O( \log(1/\delta_1) )$ estimators $\{\mathcal{D}_j\}_{j=1}^{L}$, for each fixed query point $q \in \R^d$, the median of the queries from the $L$ estimators satisfies
\begin{align*}
     (1-\epsilon) \cdot f_{\mathsf{KDE}}^*(q) 
    \leq & ~ \mathrm{Median}(\{\mathcal{D}_j.\textsc{query}(q, \epsilon)\}_{j=1}^{L}) \\
    \leq & ~ (1 + \epsilon)\cdot f_{\mathsf{KDE}}^*(q)
\end{align*}
with probability at least $1 - \delta_1$.
\end{lemma}
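
The following is a minimal sketch of why $L = O(\log(1/\delta_1))$ repetitions suffice. It assumes, which is not stated explicitly above, that the $L$ estimators use independent randomness; the indicator variables $Z_j$ and the constant $0.32$ are introduced here only for illustration. Let $Z_j \in \{0,1\}$ indicate that $\mathcal{D}_j.\textsc{query}(q,\epsilon)$ falls outside the interval $[(1-\epsilon) f_{\mathsf{KDE}}^*(q), (1+\epsilon) f_{\mathsf{KDE}}^*(q)]$, so that $\mathbb{E}[Z_j] \leq 0.1$ by Lemma~\ref{lem:single_estimator}. The median can leave this interval only if at least half of the estimators do, and Hoeffding's inequality then gives
\begin{align*}
\Pr\Big[ \sum_{j=1}^{L} Z_j \geq L/2 \Big]
\leq & ~ \Pr\Big[ \sum_{j=1}^{L} (Z_j - \mathbb{E}[Z_j]) \geq 0.4 L \Big] \\
\leq & ~ \exp(-2 \cdot (0.4)^2 \cdot L) \\
= & ~ \exp(-0.32 L),
\end{align*}
which is at most $\delta_1$ once $L \geq \ln(1/\delta_1)/0.32$.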



\paragraph{From each fixed point to all the net points.}
 
So far, the success probability of our algorithm holds only for a single fixed point. We now introduce an $\epsilon$-net of the unit ball and show that the algorithm succeeds with high probability on all net points simultaneously.

\begin{fact}\label{fac:number_of_net_points}
Let $N$ denote the $\epsilon_0$-net of 
\begin{align*}
% $
\{ x \in \R^d ~|~ \| x \|_2 \leq 1 \}.
% $
\end{align*}
We use $|N|$ to denote the number of points in $N$. Then $|N|\leq (10/\epsilon_0)^d$.
\end{fact}

This fact bounds the size of an $\epsilon_0$-net by $(10/\epsilon_0)^d$, so that $\log |N| \leq d \log(10/\epsilon_0)$. We use this bound to determine the number of repetitions needed for \textsc{Query} to be correct on all net points.


\begin{lemma}[From each fixed point to all the net points]\label{lem:net_points}
Let $N$ denote the $\epsilon_0$-net of 
%\begin{align*} 
$
\{ x \in \R^d ~|~ \| x \|_2 \leq 1 \}.
$
%\end{align*}
We use $|N|$ to denote the number of points in $N$. Given $L = O(\log(|N|/\delta))$ estimators $\{\mathcal{D}_j\}_{j=1}^{L}$, with probability at least $1 - \delta$, the following holds for all $q \in N$ simultaneously: the median of the queries from the $L$ estimators satisfies
\begin{align*}
   (1-\epsilon) \cdot f_{\mathsf{KDE}}^*(q) 
   \leq & ~ \mathrm{Median}(\{\mathcal{D}_j.\textsc{query}(q, \epsilon)\}_{j=1}^{L}) \\
   \leq & ~ (1 + \epsilon)\cdot f_{\mathsf{KDE}}^*(q).
\end{align*}
\end{lemma}
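
One way to see where this number of estimators comes from is the following sketch, which assumes that the net $N$ is fixed before the randomness of the estimators is drawn. Apply Lemma~\ref{lem:fixed_points} to each net point with $\delta_1 := \delta/|N|$ and take a union bound over $N$:
\begin{align*}
\Pr[ \exists q \in N : \text{the median fails at } q ]
\leq & ~ \sum_{q \in N} \frac{\delta}{|N|} \\
= & ~ \delta.
\end{align*}
Combined with Fact~\ref{fac:number_of_net_points}, the number of estimators becomes
\begin{align*}
L = O(\log(|N|/\delta)) = O( d \log(10/\epsilon_0) + \log(1/\delta) ),
\end{align*}
so $L$ grows linearly in $d$ and only logarithmically in $1/\epsilon_0$ and $1/\delta$.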

\paragraph{From net points to all points.}
With Lemma~\ref{lem:net_points}, we are ready to extend the correctness from the net points to the whole unit ball. The following lemma shows that, with high probability, every query point $q$ with
$\| q \|_2 \leq 1$ is answered approximately correctly.


\begin{lemma}[From net points to all points]\label{lem:from_net_points_to_all_points}
Let $\epsilon \in (0,0.1)$. Let $\mathcal{L} \geq 1$. Let $\delta \in (0,0.1)$. Let $\tau \in [0,1]$. 
Given $L = O(\log(( \mathcal{L}/(\epsilon \tau) )^d/\delta))$ estimators $\{\mathcal{D}_j\}_{j=1}^{L}$, with probability at least $1 - \delta$, for all query points $p$ with $\|p\|_2 \leq 1$, the median of the queries from the $L$ estimators satisfies
\begin{align*}
    (1-\epsilon) \cdot f_{\mathsf{KDE}}^*(p) 
    \leq & ~ \mathrm{Median}(\{\mathcal{D}_j.\textsc{query}(q, \epsilon)\}_{j=1}^{L}) \\
    \leq & ~ (1 + \epsilon)\cdot  f_{\mathsf{KDE}}^*(p),
\end{align*}
where $q$ is the net point closest to $p$.

\end{lemma}
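
The following sketch indicates how a guarantee on the net points can transfer to every point of the unit ball. It rests on two assumptions that are not spelled out in the statement above and should be checked against the definitions elsewhere in the paper: that $f_{\mathsf{KDE}}^*$ is $\mathcal{L}$-Lipschitz in the query point, and that $f_{\mathsf{KDE}}^*(p) \geq \tau$ for every $p$ in the unit ball. The net radius $\epsilon_0 := \epsilon \tau / \mathcal{L}$ is likewise a choice made here for illustration. For any $p$ with $\| p \|_2 \leq 1$ and its closest net point $q$, we have $\| p - q \|_2 \leq \epsilon_0$, and hence
\begin{align*}
| f_{\mathsf{KDE}}^*(p) - f_{\mathsf{KDE}}^*(q) |
\leq & ~ \mathcal{L} \cdot \| p - q \|_2 \\
\leq & ~ \epsilon \tau \\
\leq & ~ \epsilon \cdot f_{\mathsf{KDE}}^*(p).
\end{align*}
Combining this with the guarantee of Lemma~\ref{lem:net_points} at the net point $q$ gives
\begin{align*}
(1-\epsilon)^2 \cdot f_{\mathsf{KDE}}^*(p)
\leq & ~ \mathrm{Median}(\{\mathcal{D}_j.\textsc{query}(q, \epsilon)\}_{j=1}^{L}) \\
\leq & ~ (1+\epsilon)^2 \cdot f_{\mathsf{KDE}}^*(p),
\end{align*}
which is a $(1 \pm 3\epsilon)$-approximation for $\epsilon \in (0,0.1)$; rescaling $\epsilon$ by a constant factor recovers the stated bound. With this choice of $\epsilon_0$, Fact~\ref{fac:number_of_net_points} gives $|N| \leq (10 \mathcal{L}/(\epsilon \tau))^d$, which matches the number of estimators in the lemma up to constant factors.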

Thus, we obtain an algorithm that responds robustly to adversarial queries.

