\section{ADAPTIVE SEARCH ALGORITHM}\label{sec:search}







We now consider a scenario where the search space $\Omega$ is a hypercube in $\mathbb{R}^\embdim$ and where any $\targetx \in \Omega$ can be the target.
In this continuous setting, a search algorithm should be able to ``zoom in'' indefinitely, thus finding ever smaller regions containing the target.

Having access to a scale-free oracle enables us to ask queries where the response noise is independent of the current scale (or ``zoom level'').
A constant level of noise in the oracle's answers means that shrinking the current region incurs constant expected cost in terms of the number of queries asked.
This suggests an exponential rate of convergence, as long as we can ensure that the search does not permanently veer away from the target.
For the formal analysis, we assume that items are dense within the feature space, so that the target and query items may be at any location. This allows us to reason about the speed of convergence of the search process towards the target, instead of a stopping time of that process over a finite set of items.
In real-world scenarios with a finite number of items, we believe the theory still provides useful insight. With a large number of items, we expect the situation to initially resemble the dense case that we study theoretically, and we expect the algorithm to ``zoom in'' on the target at an exponential rate until it reaches a zoom level at which the dataset begins to look sparse. At that point the theory no longer applies, but we expect the search space to have been narrowed down to a small number of items, among which identifying the target is much easier.



Our algorithm operates in stages. At each stage, the algorithm investigates a region by submitting queries until a decision to zoom in or backtrack can be made.
This decision is based only on information collected in the current stage.
At the beginning of a stage, all knowledge about prior queries and oracle replies is discarded, and the only state of the algorithm is the current region.
Due to this conditional independence of decisions, we find that the sequence of regions visited by our algorithm is Markovian.
We frame our search process as a random walk on a graph, where each node corresponds to a region $\region \subset \Omega$.
Under mild assumptions on the transition probabilities between regions, an erroneous decision, i.e., zooming into a region that does not contain the target, must eventually be undone with probability 1.
A high-level overview of this idea is given in Algorithm \ref{alg:outer_inner_loop}.
Constructing a stochastic coupling between a counting process and the random walk on regions enables us to analyze the probability of consecutive errors and to explicitly calculate recurrence times.
We prove an exponential rate of convergence in Section \ref{sec:outerloop}.

We show that with access to a $\gamma$-CKL model, the assumptions on transition probabilities can always be satisfied.
In Section \ref{sec:inner_loop}, to ensure only a constant number of queries is needed in each stage, we present a query scheme that relies on the properties of a scale-free oracle.
To facilitate a formal proof, this scheme is based on hypothesis testing.
We discuss an efficient implementation based on numerical integration in Section \ref{sec:implementation}.


\begin{algorithm}[t]
    \begin{algorithmic}
    \STATE {\bfseries Input:} query budget $\budget$
    \STATE $\searchstep \gets 0$ \COMMENT{stage number}
    \STATE $\region_\searchstep \gets \Omega$ \COMMENT{current region}
    \STATE $\oraclestep \gets 0$ \COMMENT{number of queries asked}
    \REPEAT
      \STATE $\mathcal{D} \gets \{\}$
      \COMMENT{Drop previous observations}
      \REPEAT %
        \STATE $(\hat{\mathbf{x}}_{i}, \hat{\mathbf{x}}_j) \gets \text{nextQuery}(\region_\searchstep, \mathcal{D})$
        \STATE $\hat{y} \sim \text{Bernoulli}(p_{\hat{\mathbf{x}}_{i}, \hat{\mathbf{x}}_j, \mathbf{x}_t})$
        \STATE $\mathcal{D} \gets \mathcal{D} \cup \{(\hat{\mathbf{x}}_{i}, \hat{\mathbf{x}}_j, \hat{y})\}$
        \STATE $\oraclestep \gets \oraclestep + 1$
      \UNTIL{$\text{decisionReady}(\region_\searchstep, \mathcal{D}) = \text{true}$}
      \STATE $\region_{\searchstep + 1} \gets$ zoom into / out of $\region_\searchstep$ based on $\mathcal{D}$
      \STATE $\searchstep \gets \searchstep + 1$
    \UNTIL{$\oraclestep > \budget$}
  
    \end{algorithmic}
    \caption[]{Exponential search algorithm}
    \label{alg:outer_inner_loop}
    \end{algorithm}


\subsection{Convergence Analysis}\label{sec:outerloop}




Let $\region$ be the current belief region of the search process.
We assume that $\region$ is a unit hypercube centered at the origin. We restrict our analysis to hypercubes, although the idea of the algorithm applies to arbitrary regions.
Let \(\slack\) be a hypercube centered at the origin with edge length \(\frac{3}{2}\).
Let $\mathcal{T}_{\slack, \frac{1}{4}}$ be a set of hypercubes with edge length $\frac{1}{4}$ that tile $\slack$.
We define the set of \textit{children} $\children{\region}$ as
the set of all hypercubes of edge length $\frac{1}{2}$
that can be constructed by joining tiles in $\mathcal{T}_{\slack, \frac{1}{4}}$.
Figure \ref{fig:outerloop_regions} illustrates this construction.
Along each axis, there are five possible positions for a child, which gives a total of \(5^\embdim\) children.
The hypercube of edge length $4$ centered at the origin contains all regions $\region'$ for which $\region \in \children{\region'}$. It is the union of all direct ancestors of $\region$; for convenience, we refer to it as the \textit{parent} $\parent{\region}$ of $\region$. In our algorithm, backtracking from $\region$ leads to $\parent{\region}$.
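
As a quick sanity check of these counts, consider a single axis (a worked example under the conventions above; the symbol $c'$ is introduced only for this remark).
The region $\slack$ covers the interval $[-\frac{3}{4}, \frac{3}{4}]$, which holds six tiles of length $\frac{1}{4}$.
A child spans two adjacent tiles, so its center lies at one of
\begin{align*}
  -\tfrac{1}{2},\quad -\tfrac{1}{4},\quad 0,\quad \tfrac{1}{4},\quad \tfrac{1}{2},
\end{align*}
which gives five positions per axis and hence \(5^\embdim\) children.
Applying the same construction to a region $\region'$ of edge length $2$ doubles all lengths: the children of $\region'$ have edge length $1$, and their centers are offset from the center $c'$ of $\region'$ by one of $-1, -\frac{1}{2}, 0, \frac{1}{2}, 1$.
For $\region$, centered at the origin, to be such a child, we need $c' \in \{-1, -\frac{1}{2}, 0, \frac{1}{2}, 1\}$ along each axis.
The union of the corresponding intervals is
\begin{align*}
  \bigcup_{c'} \, [c' - 1, \; c' + 1] = [-2, 2],
\end{align*}
which matches the edge length $4$ of $\parent{\region}$.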




\newcommand*{\xMin}{0}%
\newcommand*{\xMax}{6}%
\newcommand*{\yMin}{0}%
\newcommand*{\yMax}{6}%


\newcommand*{\cMin}{-4}%
\newcommand*{\cMax}{3}%

\begin{figure}[t]
\centerline{
  \resizebox{0.7\columnwidth}{!}{%
  \begin{tikzpicture}    %
    \foreach \i in {\xMin,...,\xMax} {
        \draw [very thin,SNSGRAY] (\i,\yMin) -- (\i,\yMax)  node [below] at (\i,\yMin) {};
    }
    \foreach \i in {\yMin,...,\yMax} {
        \draw [very thin,SNSGRAY] (\xMin,\i) -- (\xMax,\i) node [left] at (\xMin,\i) {};
    }


    \draw[line width=3mm, SNSGREEN] (\xMin+1,\yMin+1) rectangle ++(4,4);

    \draw[line width=3mm, SNSLIGHTBLUE] (3, 3) rectangle ++(2, 2);

    \draw[line width=3mm, SNSLIGHTBLUE] (4, 4) rectangle ++(2, 2);

    \draw[line width=3mm, SNSBLUE] (-5, -5) rectangle ++(16, 16);

    \node (REGION_label) at (3, -1) [scale=1.5]{\Huge current region \(\region\)};
    \node (REGION_target) at (3,1) {};
    \draw[line width=1mm, ->] (REGION_label.north) -- (REGION_target);

    \node (PARENT_label) at (3, -3.5) [scale=1.5]{\Huge parent \(\parent{\region}\)};
    \node (PARENT_target) at (3, -5) {};
    \draw[line width=1mm, ->] (PARENT_label.south) -- (PARENT_target);

    \node (CHILD_label) at (4, 9) [scale=1.5]{\Huge child regions $\in \children{\region}$};
    \node (CHILD_target1) at (5,5) {};
    \draw[line width=1mm, ->] (CHILD_label.south) ++(-0.2,0)  -- (CHILD_target1);

    \draw[line width=1mm, |-|] (-4, -4.8)  -- (-4,10.8);
    \node at (-3.5, 3) [scale=1.7]{\Huge 4};
  
    \draw[line width=1mm, |-|] (0.2, 1)  -- (0.2,5);
    \node at (-0.5, 3) [scale=1.7]{\Huge 1};

    \draw[line width=1mm, |-|] (6.5, 4)  -- (6.5,6);
    \node at (7.3, 5) [scale=1.7]{\Huge $\frac{1}{2}$};

\end{tikzpicture}
}%
}
\caption{The current region $\region$, two of its children, and its parent region.}
\label{fig:outerloop_regions}
\end{figure}







If \(\targetx \in \region\) then we call \(\region\) green (correct), otherwise red (incorrect).
A green region must have at least one green child. The parent of a green region must be green.


At every stage, our algorithm collects query replies, until it makes a decision to proceed to one of the child regions or to backtrack to the parent.
Similarly to the classification of regions, we distinguish between correct and incorrect decisions.
Proceeding to a green node is correct, whereas backtracking from a green node is incorrect.
Proceeding to a red node is incorrect, whereas backtracking from a red node is correct.
The probability of all these events depends on the current region $\region$ and the target $\targetx$. 
Table \ref{green-table} shows a comprehensive listing.
We denote the probabilities of correct transitions by $p$ and those of incorrect transitions by $q$.
For a green region, there are two incorrect decisions: backtracking, and straying by proceeding to a red child. Their probabilities are denoted by $\puprt$ and $\pstray$, respectively. The correct decision is to proceed to a green child; its probability is denoted by $\pdnrt$.
For a red region, there are two correct decisions: backtracking, and recovering by proceeding to a green child. Their probabilities are denoted by $\pupwr$ and $\precover$, respectively. The incorrect decision is to proceed to a red child; its probability is denoted by $\pdnwr$.
For the analysis of a random walk on colored regions, we need the total probability of a correct or an incorrect decision, which is also shown in Table \ref{green-table}.
An illustration with nested regions and the corresponding transitions is shown in Figure \ref{tikz:illustration}.




\begin{table}[t]
\caption{Transition probabilities.}
\label{green-table}
\begin{center}
\begin{small}
\begin{sc}
\begin{tabular}{lr}
\toprule
\multicolumn{2}{c}{$\region$ is green:} \\
Transition to & probability \\
\midrule   
parent &$\puprt(\region, \targetx)$ \\
green child & $\pdnrt(\region, \targetx)$ \\
red child &  $\pstray(\region, \targetx)$\\
\midrule   
correct & $\pright(\region, \targetx) = \pdnrt(\region, \targetx)$\\
incorrect &  $\pwrong(\region, \targetx) = \puprt(\region, \targetx) + \pstray(\region, \targetx)$\\
\bottomrule
\end{tabular}
\vskip 0.15in
\begin{tabular}{lr}
\toprule
\multicolumn{2}{c}{$\region$ is red:} \\
Transition to & probability \\
\midrule   
parent &$\pupwr(\region, \targetx)$ \\
red child & $\pdnwr(\region, \targetx)$ \\
green child &  $\precover(\region, \targetx)$\\
\midrule   
correct & $\pright(\region, \targetx) = \pupwr(\region, \targetx) + \precover(\region, \targetx)$\\
incorrect &  $\pwrong(\region, \targetx) = \pdnwr(\region, \targetx)$\\
\bottomrule
\end{tabular}
\end{sc}
\end{small}
\end{center}
\end{table}

\begin{figure*}[ht]
  \begin{center}
  \resizebox{0.9\textwidth}{!}{%
  \tikzset{
    font=\Large}
  \begin{tikzpicture}[
    roundnode/.style={rounded rectangle, draw=SNSGREEN,pattern=crosshatch dots, pattern color=SNSGREEN!10, ultra thick, minimum width=25mm, minimum height=15mm},
    squarednode/.style={rounded rectangle, dashed, draw=SNSRED, pattern=north west lines, pattern color=SNSRED!20, ultra thick, minimum size=15mm},
    ]

    \coordinate (right) at (5,0);
    \coordinate (left) at (-3,0);


    \filldraw[color=SNSGREEN, pattern=crosshatch dots, pattern color=SNSGREEN!30, ultra thick](-6,-7) rectangle ++(6,6);
    \filldraw[color=SNSRED, dashed, pattern=north west lines, pattern color=SNSRED!50, ultra thick](-5.9,-6.9) rectangle ++(2.95,2.95);
    \filldraw[color=SNSGREEN, pattern=crosshatch dots, pattern color=SNSGREEN!30,  ultra thick](-3,-4) rectangle ++(2.9,2.9);

    \filldraw[color=SNSRED, dashed, pattern=north west lines, pattern color=SNSRED!50, ultra thick](-3.75,-4.75) rectangle ++(1.5, 1.5);
    \filldraw[color=SNSGREEN, pattern=crosshatch dots, pattern color=SNSGREEN!30, ultra thick](-2.675,-3.64) rectangle ++(0.85, 0.85);

    \filldraw[color=blue!60, fill=blue!100, very thick](-2, -3) circle (0.1);
    
    \node[roundnode]        (X1NODE)       [below=of right] {\(\region_1\): $\z{\region_1}{\targetx} = 0$};
    \coordinate[right=0.5cm of X1NODE.north] (X1NODE_nw);
    \coordinate[left=0.5cm of X1NODE.north] (X1NODE_ne);
    \coordinate[right=0.5cm of X1NODE.south] (X1NODE_sw);
    \coordinate[left=0.5cm of X1NODE.south] (X1NODE_se);
    
    \node[squarednode]        (X2NODE)       [below=of X1NODE] {\(\region_2\): $\z{\region_2}{\targetx} = 1$};
    \coordinate[right=0.5cm of X2NODE.north] (X2NODE_nw);
    \coordinate[left=0.5cm of X2NODE.north] (X2NODE_ne);
    \coordinate[right=0.5cm of X2NODE.south] (X2NODE_sw);
    \coordinate[left=0.5cm of X2NODE.south] (X2NODE_se);

    \node[roundnode]        (X5NODE)       [below=of X2NODE] {\(\region_5\): $\z{\region_5}{\targetx} = 0$};
    \coordinate[right=0.5cm of X5NODE.north] (X5NODE_nw);
    \coordinate[left=0.5cm of X5NODE.north] (X5NODE_ne);
    \coordinate[right=0.5cm of X5NODE.south] (X5NODE_sw);
    \coordinate[left=0.5cm of X5NODE.south] (X5NODE_se);

    \node[squarednode]        (X3NODE)       [right=2.5cm of X2NODE] {\(\region_3\): $\z{\region_3}{\targetx} = 1$};
    \coordinate[right=0.5cm of X3NODE.north] (X3NODE_nw);
    \coordinate[left=0.5cm of X3NODE.north] (X3NODE_ne);
    \coordinate[right=0.5cm of X3NODE.south] (X3NODE_sw);
    \coordinate[left=0.5cm of X3NODE.south] (X3NODE_se);
    
    \coordinate[above=0.5cm of X1NODE.east] (X1NODE_en);
    \coordinate[below=0.5cm of X1NODE.east] (X1NODE_es);
    
    \node[roundnode]      (X4NODE)       [right=2.5cm of X1NODE]  {\(\region_4\): $\z{\region_4}{\targetx} = 0$};
    \coordinate[above=0.5cm of X4NODE.east] (X4NODE_en);
    \coordinate[below=0.5cm of X4NODE.east] (X4NODE_es);
    \coordinate[above=0.5cm of X4NODE.west] (X4NODE_wn);
    \coordinate[below=0.5cm of X4NODE.west] (X4NODE_ws);
    
     
    \coordinate (top) at (0.7,-0.7);
    \coordinate (bottom) at (0.7,-7.5);
    \path (top) edge [ultra thick, dashed, gray!50] node {} (bottom);
    
    \node (R1_label) at (-5.4, -1.7) {\huge\(X_1\)};
    \node (R2_label) at (-5.3, -4.5) {\huge\(X_2\)};
    \node (R3_label) at (-3.4, -3.6) {\huge\(X_3\)};
    \node[] (R1E2_label) at (-2.27, -3.3) {\huge \(X_4\)};
    \node[] (R1E2_label) at (-2.4, -1.75) {\huge\(X_5\)};

    \draw[thick, ->] (-1.3, -2.5) -- (-1.9, -2.9);
    \node at (-0.9, -2.3) {\huge $\targetx$};


    \draw[ultra thick, ->] (X1NODE.south) -- (X2NODE.north) node [xshift=-3pt, yshift=10pt, anchor=east] {$q_s(\region_1, \targetx)$};
    \draw[ultra thick, ->] (X2NODE.east) -- (X3NODE.west) node [xshift=-35pt, anchor=north] {$q_d(\region_2, \targetx)$};
    \draw[ultra thick, ->] (X3NODE.north) -- (X4NODE.south) node [yshift=-10pt, anchor=west] {$p_r(\region_3, \targetx)$};
    \draw[ultra thick, ->] (X3NODE.north west) -- (X1NODE.south east) node [xshift=32pt, anchor=west] {$p_u(\region_3, \targetx)$};
    \path[->] (X1NODE.west) edge [bend right=90, ultra thick] node[xshift=10pt, yshift=-40pt, anchor=west] {$p_d(\region_1, \targetx)$} (X5NODE.west);

    \end{tikzpicture}
  }%
  \caption{Regions and target (blue dot) on the left side, selected transitions on the right side.} 
  \label{tikz:illustration}
    \end{center}
  \end{figure*}%




\begin{lemma}\label{lemma:region_rw}
  The sequence of regions $\region_\searchstep$ visited in each stage $\searchstep$ of the search process forms a random walk.
\end{lemma}

Intuitively, we need the probability of making a correct decision to be strictly higher than the probability of an incorrect decision.
This is formalized in the following assumptions.
Let \(\decisionbias>0\) be a constant such that, for any \(\region \subset \Omega\) that can be visited by our algorithm and any \(\targetx\), the following hold:
\begin{assumption}
  \label{assump:bias}
   $\pright(\region, \targetx) - \pwrong(\region, \targetx)> \decisionbias$.
  \end{assumption}
  \begin{assumption}\label{assump:depth_bias}
    $\targetx \in \region \implies  \pdnrt(\region, \targetx) - 2\puprt(\region, \targetx) - \frac{\decisionbias+1}{2\decisionbias}\,\pstray(\region, \targetx) >0$.
  \end{assumption}

Assumption \ref{assump:depth_bias} is designed to facilitate the proof of Theorem \ref{theorem:exp_v2}. 

In practice, it is simple to tune the confidence with which the search makes its decisions: Collecting more queries before committing to a decision decreases the chance of making a mistake.
The next theorem asserts that, with access to a scale-free oracle, it is always possible to satisfy Assumptions \ref{assump:bias} and \ref{assump:depth_bias}.
We constructively prove Theorem \ref{theorem:innerloop_existence} by presenting Algorithm \ref{alg:inner_loop} in Section \ref{sec:inner_loop}.
\begin{theorem}\label{theorem:innerloop_existence}
  For any $\decisionbias$ and any $\region$, there is an algorithm that needs to observe, at most, a constant and finite number of replies from a $\gamma$-CKL oracle,
  until it can make a decision with probabilities that satisfy Assumptions \ref{assump:bias} and \ref{assump:depth_bias}.
\end{theorem}

We keep track of the number of incorrect decisions.
Let $\z{\region}{\targetx}$ be the number of backtracking decisions that are needed to reach a green region from $\region$.
If the search proceeds to a red child, $\z{\region}{\targetx}$ is either increased by 1, or stays unchanged (it is possible that no additional backtracking is required). Recovering, by proceeding to a green child, means immediately setting $\z{\region}{\targetx}$ to 0.



    Every stage ends in exactly one decision, so $\pright(\region, \targetx) + \pwrong(\region, \targetx) = 1$. Together with Assumption \ref{assump:bias}, this gives
    $\targetx \in \region \implies \puprt(\region, \targetx) + \pstray(\region, \targetx) < \frac{1-\decisionbias}{2}$
    and $\targetx \notin \region \implies \pdnwr(\region, \targetx) < \frac{1-\decisionbias}{2}$.
    We construct a time-homogeneous random walk that will serve as a stochastic upper bound for $\z{\region_\searchstep}{\targetx}$.
    Let $\rwbd_\searchstep$ be a random walk on the natural numbers, starting at $\rwbd_0 = \z{\region_0}{\targetx}$.
    At each step, $\rwbd_\searchstep$ is incremented with probability $\frac{1-\decisionbias}{2}$ and decremented with probability $\frac{1+\decisionbias}{2}$.
    At state $0$, the walk stays in place with probability $\frac{1+\decisionbias}{2}$ and moves to $1$ with probability $\frac{1-\decisionbias}{2}$.

    \begin{lemma}\label{lemma:stoch_upper_bound}
      Given a stochastic decision criterion that satisfies Assumption \ref{assump:bias}, 
      $\rwbd_\searchstep$ is a stochastic upper bound for $\z{\region_\searchstep}{\targetx}$. We denote this by $\z{\region_\searchstep}{\targetx} \preceq_{st.} \rwbd_\searchstep$.
    \end{lemma}
    \begin{proof}[Proof sketch]
      We construct a coupling $(\tilde{\region}_\searchstep, \tilde{Z}_\searchstep)$ between the random walk on regions and the walk $\rwbd_\searchstep$.
      We then use induction to show that, with probability 1, it holds that $\tilde{Z}_\searchstep \geq \z{\tilde{\region}_\searchstep}{\targetx}$.
    \end{proof}


\begin{lemma} \label{lemma:number_of_errors}
  Given a stochastic decision criterion that satisfies Assumption \ref{assump:bias}, for any $k > 0$
  \begin{align*}
  \mathbf{P}[\z{\region_\searchstep}{\targetx} > k] &\le \left(\frac{1-\decisionbias}{1+\decisionbias}\right)^k.
  \end{align*}
\end{lemma}
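
The geometric form of this bound can be motivated by a short calculation on the bounding walk $\rwbd_\searchstep$. It is given here as a sketch and not as a replacement for the formal proof; the shorthand $\rho$ and the symbol $\pi$ are introduced only for this remark.
Write $\rho = \frac{1-\decisionbias}{1+\decisionbias}$, with $0 < \decisionbias < 1$, and let $\pi$ denote the stationary distribution of $\rwbd_\searchstep$.
Balancing the probability flow between neighboring states $k$ and $k+1$ gives
\begin{align*}
  \pi_k \, \frac{1-\decisionbias}{2} &= \pi_{k+1} \, \frac{1+\decisionbias}{2}
  \quad \text{for all } k \geq 0,
  &\text{hence}\quad \pi_k &= (1-\rho)\,\rho^{k}, \\
  \sum_{j > k} \pi_j &= \rho^{k+1} \leq \rho^{k}.
\end{align*}
In stationarity, the tail of $\rwbd_\searchstep$ is therefore geometric with ratio $\rho$, and Lemma \ref{lemma:stoch_upper_bound} relates tail bounds on $\rwbd_\searchstep$ to tail bounds on $\z{\region_\searchstep}{\targetx}$.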



Let $\tau_{\region} = \inf \{\searchstep>0 \mid \targetx \in \region_\searchstep\}$, for a search started at $\region_0 = \region$, be the first time at which a green region is reached.
\begin{lemma}\label{lemma:stray_time}
  Let $\region$ be red and $\parent{\region}$ be green (this occurs after just having strayed from a green region).
  Given a stochastic decision criterion that satisfies Assumption \ref{assump:bias}, it holds that $\mathbf{E}[\tau_{\region}] \leq \frac{1}{\decisionbias}$.
\end{lemma}
\begin{proof}[Proof sketch]
The proof relies on $\rwbd_\searchstep$ as a stochastic upper bound.
We first use the Ergodic Theorem to prove the existence of a unique stationary distribution.
We then explicitly calculate this stationary distribution and use it to derive recurrence times.
This enables us to prove an upper bound on the expected stopping time.
\end{proof}
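
The value $\frac{1}{\decisionbias}$ is what a first-step analysis of the bounding walk suggests. The following is a sketch under the assumption that the expectation is finite; the symbol $h$ is introduced only for this remark.
A red region with a green parent has $\z{\region}{\targetx} = 1$, so we consider the expected number of steps $h$ that $\rwbd_\searchstep$ needs to move from $1$ to $0$.
From state $1$, the walk reaches $0$ in one step with probability $\frac{1+\decisionbias}{2}$, and otherwise moves to $2$, from where it needs $2h$ further steps in expectation, because the transition probabilities are the same at every positive state:
\begin{align*}
  h &= 1 + \frac{1-\decisionbias}{2} \cdot 2h = 1 + (1-\decisionbias)\,h
  &\implies\quad h &= \frac{1}{\decisionbias}.
\end{align*}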

To quantify the progress of our search, we keep track of the \textit{depth} $\depth{\region}$ of a region.
The depth is the number of consecutive proceed decisions needed to reach this region, starting from $\Omega$.
The edge length of a region $\region$ at depth $\depth{\region}$ is $\left(\frac{1}{2}\right)^{\depth{\region}}$.
The $k$-th ancestor $u(\region, k)$ is reached by backtracking $k$ times from $\region$.
The following theorem establishes the exponential rate of convergence of our algorithm. At every stage $\searchstep$ of the algorithm, the ancestor $u(\region_\searchstep, k)$ contains the target with high probability (which does not depend on $\searchstep$), and its expected depth grows linearly in $\searchstep$.

\begin{theorem}\label{theorem:exp_v2}
Given a subroutine that satisfies Assumptions \ref{assump:bias} and \ref{assump:depth_bias},
for any desired probability of error $\delta$,
there are two constants $k>0$ and $C>0$ such that
\begin{align*}
\mathbf{P}\left[ \targetx \in u(\region_\searchstep, k) \right] &>1-\delta,
&\mathbf{E}\left[  \depth{u(\region_\searchstep, k)} \right] &> C \searchstep.
\end{align*}
\end{theorem} 
\begin{proof}[Proof sketch]
We define a stopping time of arriving at a green region after leaving a green region.
Using the results of Lemma \ref{lemma:stray_time}, we prove an upper bound for the expectation of this stopping time.
Using Assumption \ref{assump:depth_bias}, we show that the expected depth of each consecutive green region increases linearly.
Together with Lemma \ref{lemma:number_of_errors}, the statement follows.
\end{proof}

\subsection{A Scale-Free Decision Criterion} \label{sec:inner_loop}


In each stage, we need a querying scheme that asks at most a constant number of queries, until it arrives at a decision.
The probability of error needs to satisfy Assumptions \ref{assump:bias} and \ref{assump:depth_bias}.

Our scheme is based on a test for the hypothesis (H) ``\textit{$\targetx$ is in the region $\region$}''.
As $\targetx$ approaches the boundary of $\region$, it becomes increasingly hard to distinguish whether the point is inside or outside.
This leads to a region of uncertainty $\uncertain$ around $\region$ in which our hypothesis test is not reliable.


Let $\region$ be a hypercube of edge length 2, centered at the origin, and let
$\uncertain$ be a hypersphere with radius $\radiusU > 1$, also centered at the origin.
Everything outside of $\uncertain$ is $\far = \Omega \setminus \uncertain$. We will construct a query $Q$ and calculate the corresponding $\radiusU$ such that repeatedly observing the outcome of $Q$
enables us, with probability $>1-\delta$, to accept (H) if $\targetx \in \region$, or to reject (H) if $\targetx \in \far$.

\begin{lemma}\label{lemma:hyptest}
  We assume $\embdim > 1$. %
  Let $Q = (\origin, (1+d)\mathbf{e})$, where $\mathbf{e} = (1, 0, 0, \dots) $ is a unit vector along an arbitrarily chosen axis.
  Let $\radiusU = 1+ \frac{\embdim + \sqrt{\embdim^{3} + \embdim^{2} - \embdim}}{\embdim - 1}$.
  Let $\region, \uncertain, \far$ be defined as above.
  Then, for any $\delta >0$, observing a constant number of query outcomes is enough to apply a one-tailed binomial hypothesis test that, with probability at least $1-\delta$, accepts (H) if $\targetx \in \region$, and rejects (H) if $\targetx \in \far$.
  The necessary number of observations does not depend on $\region$ or $\targetx$.
\end{lemma}


We need to find out whether a child of the current belief region contains the target.
Due to the region of uncertainty, we cannot apply the hypothesis test directly to the child regions.
Instead, we construct a finer discretization grid.


In Lemma \ref{lemma:hyptest}, we assume a region of edge length 2.
As our oracle model is scale-free, we can apply the hypothesis test to a smaller region, which results in a smaller uncertainty region as well.
For a region with edge length $r_c$, the radius of the uncertain region is scaled by $r_c/2$.
Let $r_c < \frac{1}{8\radiusU}$, which leads to an uncertain region with radius $\radiusU \frac{r_c}{2} <\frac{1}{16}$. 
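
To give a sense of the magnitudes involved, consider $\embdim = 2$ as a worked example. The value chosen for $r_c$ below is purely illustrative and is not prescribed by the analysis.
Inserting $\embdim = 2$ into the expression for the radius of the uncertain region from Lemma \ref{lemma:hyptest} gives
\begin{align*}
  \radiusU &= 1 + \frac{2 + \sqrt{8 + 4 - 2}}{2 - 1} = 3 + \sqrt{10} \approx 6.162,
  &\frac{1}{8\radiusU} &\approx 0.02028.
\end{align*}
The choice $r_c = \frac{3}{148} \approx 0.02027$ satisfies $r_c < \frac{1}{8\radiusU}$, and the edge length $\frac{3}{2}$ of $\slack$ is exactly $74\, r_c$.
The scaled uncertain region then has radius
\begin{align*}
  \radiusU \, \frac{r_c}{2} \approx 0.06246 < \frac{1}{16} = 0.0625.
\end{align*}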



Let \(\tiling{\slack, r_c}\) be a tiling of \(\slack\) with hypercubes of edge length \(r_c\). We refer to the cells in this tiling by $\cell{k}$, $k = 1, \dots, K$, and
to their respective centers by $\mathbf{x}_{\cell{k}}$.
If the edge length of $\slack$ is not an integer multiple of $r_c$, it is always possible to pick a smaller value for $r_c$ for which it is.
Each cell $\cell{k}$ in the tiling belongs to one of these classes:
\begin{itemize}
  \item (A) \(\targetx \in \cell{k}\). With high probability, the hypothesis test will not reject (H). We assume that cells include their border. If the target happens to lie exactly on the boundary between cells, then all of them belong to class (A).
  \item (B) \(\targetx \notin \cell{k} \land \|\targetx - \mathbf{x}_{\cell{k}}\|< \frac{1}{16}\). The target lies in the uncertain region of the hypothesis test. We do not make any assumption about whether (H) is rejected or not.
  \item (C) \(\targetx \notin \cell{k} \land \|\targetx - \mathbf{x}_{\cell{k}}\|\geq\frac{1}{16}\). With high probability, the hypothesis test will reject (H).
\end{itemize}

When using the hypothesis test, we know that, with high probability, all cells in class (C) are rejected.
The remaining cells fit in a small bounding box. 
This is illustrated in Figure \ref{fig:nested_grid}.


\begin{figure}[t]
\centerline{

\resizebox{0.7\columnwidth}{!}{%
\tikzset{
font=\Huge}
\begin{tikzpicture}


    \foreach \i in {\cMin,...,\cMax} {
        \draw [line width=3mm,SNSRED] (\i,\cMin) -- (\i,\cMax)  node [below] at (\i,\cMin) {};
    }
    \foreach \i in {\cMin,...,\cMax} {
        \draw [line width=3mm,SNSRED] (\cMin,\i) -- (\cMax,\i) node [left] at (\cMin,\i) {};
    }
    
    \draw[line width=3mm,SNSRED] (\cMin,\cMin) rectangle (\cMax,\cMax);
    \draw[line width=3mm, SNSLIGHTBLUE] (-3,-1) rectangle ++(5, 1);
    \draw[line width=3mm, SNSLIGHTBLUE] (-1,-3) rectangle ++(1,5);
    \draw[line width=3mm, SNSLIGHTBLUE] (-2,-2) rectangle ++(3, 3);
    
    
    \draw[line width=3mm, SNSGREEN, pattern=north west lines, pattern color=SNSGREEN] (-1,-1) rectangle ++(1, 1);


    \foreach \i in {-9, -3, 3} {
      \foreach \j in {-9, -3, 3}{
        \draw[line width=1mm,SNSGRAY] (\i-0.5, \j-0.5) rectangle ++(6,6);
      }
    }

    \node (A_label) at (-0.5, 7.5) [SNSGREEN, scale=2] {\Huge A};
    \node (A_target) at (-0.5,-0.5) {};
    \draw[line width=1mm, ->] (A_label.south) -- (A_target);

    \node (B_label) at (0.5, 6) [SNSLIGHTBLUE, scale=2]{\Huge B};
    \node (B_target) at (0.5,0.5) {};
    \draw[line width=1mm, ->] (B_label.south) -- (B_target);

    \node (B_label) at (1.5, 4.5)[SNSRED, scale=2] {\Huge C};
    \node (B_target) at (1.5,1.5) {};
    \draw[line width=1mm, ->] (B_label.south) -- (B_target);

    \draw[line width=1mm, |-|] (-7, -9.4)  -- (-7,-3.4);
    \node at (-6, -6.5) [scale=2] {\Huge $\frac{1}{4}$};

\end{tikzpicture}
}%
}
\caption{Layout of the nested grid. The target lies in cell A.}
\label{fig:nested_grid}
\end{figure}


\begin{lemma}\label{lemma:bbox}
  There is a hypercube $\boundingbox$ with an edge length of less than $\frac{1}{4}$, such that all cells in the classes (A) and (B) are fully contained in $\boundingbox$.
\end{lemma}
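
The bound in Lemma \ref{lemma:bbox} can be made plausible with a per-axis estimate, given here as a sketch under the definitions above and not as the formal proof.
For a cell in class (B), each coordinate of its center differs from the corresponding coordinate of $\targetx$ by less than $\frac{1}{16}$, and the cell extends $\frac{r_c}{2}$ beyond its center along each axis.
For a cell in class (A), every point is within $r_c$ of $\targetx$ along each axis, and $r_c < \frac{1}{16} + \frac{r_c}{2}$ because $r_c < \frac{1}{8}$.
All cells in classes (A) and (B) therefore fit in an axis-aligned hypercube centered at $\targetx$ with edge length
\begin{align*}
  2\left(\frac{1}{16} + \frac{r_c}{2}\right) = \frac{1}{8} + r_c < \frac{1}{8} + \frac{1}{8\radiusU} < \frac{1}{4},
\end{align*}
where the last two steps use $r_c < \frac{1}{8\radiusU}$ and $\radiusU > 1$.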


We apply the hypothesis test to all cells in \(\tiling{\slack, r_c}\).
Let $\boundingbox$ be the bounding box containing all cells for which (H) was not rejected.
If $\boundingbox$ does not overlap with $\region$, we backtrack. Otherwise, if there is a child region that fully contains $\boundingbox$, we proceed to it.
This mechanism is formalized in Algorithm \ref{alg:inner_loop}.
The following Theorem \ref{theorem:innerloop_main} shows that this algorithm enables us to make decisions that lead to an exponential rate of convergence, i.e., they satisfy Assumptions \ref{assump:bias} and \ref{assump:depth_bias}.
\begin{theorem}\label{theorem:innerloop_main}
  Algorithm \ref{alg:inner_loop} can achieve any desired probability of error $\hat{\delta}$, while requiring only a finite number of queries.
  In particular, choosing $\hat{\delta}$ small enough ensures that the scheme is compatible with Assumptions \ref{assump:bias} and \ref{assump:depth_bias}.
\end{theorem}
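
The role of the tile size $\frac{1}{4}$ in this decision rule can be checked with a short per-axis argument, given here as a sketch for the case that $\region$ is the unit hypercube centered at the origin, $\targetx \in \region$, and all cells in class (C) are rejected.
By Lemma \ref{lemma:bbox}, the cells that are not rejected fit in a hypercube $\boundingbox$ of edge length less than $\frac{1}{4}$. If the cell in class (A) is not rejected, $\boundingbox$ contains $\targetx$.
Along each axis, the coordinate of $\targetx$ lies in $[-\frac{1}{2}, \frac{1}{2}]$, so the extent of $\boundingbox$ satisfies
\begin{align*}
  \boundingbox \text{ along one axis} \;\subset\; \left(-\tfrac{1}{2} - \tfrac{1}{4}, \; \tfrac{1}{2} + \tfrac{1}{4}\right) = \left(-\tfrac{3}{4}, \tfrac{3}{4}\right),
\end{align*}
which is the extent of $\slack$.
An interval of length less than $\frac{1}{4}$ contains at most one tile boundary of $\mathcal{T}_{\slack, \frac{1}{4}}$, so it is covered by two adjacent tiles, which is exactly the extent of a child along that axis.
Under these conditions there is a child in $\children{\region}$ that contains $\boundingbox$, and this child is green because it contains $\targetx$.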


\begin{algorithm}[t]
  \begin{algorithmic}
  \STATE Set up a discretization \(\tiling{\slack, r_c}\).
  \STATE $H \gets \emptyset$
  \STATE
  \STATE This loop replaces the nextQuery subroutine from Algorithm \ref{alg:outer_inner_loop}
  \FOR{$\cell{k}$ in \(\tiling{\slack, r_c}\)}
    \STATE perform the hypothesis test from Lemma \ref{lemma:hyptest} for $\cell{k}$
    \IF{hypothesis is not rejected}
      \STATE $H \gets H \cup \cell{k}$
    \ENDIF
  \ENDFOR
  \STATE
  \STATE This criterion corresponds to decisionReady from Algorithm \ref{alg:outer_inner_loop}
  \IF{$\exists \hat{X} \in \children{\region_\searchstep} : H \subseteq \hat{X}$}
    \STATE proceed to $\hat{X}$
  \ELSE
    \STATE backtrack to parent $\parent{\region_\searchstep}$
  \ENDIF
  \end{algorithmic}
  \caption[]{Search with hypothesis test criterion}\label{alg:inner_loop}
  \end{algorithm}

\subsection{Implementation}\label{sec:implementation}
In practice, conducting a series of independent hypothesis tests is inefficient.
A real-world implementation should instead rely on numerical integration.
Within each stage, the algorithm collects oracle replies and updates an approximation of the posterior distribution of the target location until a decision can be made.
The \textit{nextQuery} subroutine in Algorithm \ref{alg:outer_inner_loop} then corresponds to drawing a random sample based on the current belief region. As more evidence from queries is collected, the posterior distribution either concentrates in a child region or shows that the target is likely not in the current belief region. We define a confidence threshold $\alpha$.
The algorithm proceeds if there is a subregion $\hat{X}$ with $\int_{\hat{X}}p(\targetx \mid Q,y)\, d\targetx > \alpha$, and backtracks if $\int_{\region}p(\targetx \mid Q, y)\, d\targetx < 1-\alpha$. This prescribes a criterion for \textit{decisionReady} in Algorithm \ref{alg:outer_inner_loop}. Please refer to the supplementary material for an implementation in Python and to Section \ref{sec:synthetic_exp} for a benchmark of our algorithm based on synthetic data.
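
For concreteness, the posterior used in this criterion can be written out via Bayes' rule. The following is a sketch under assumptions that the text does not fix: a prior density $\pi$ on the target location (for example, uniform on a bounded region) and replies that are conditionally independent given the target. The symbols $\pi$, $N$, and $p_n$ are introduced only for this remark.
Let $\mathcal{D} = \{(\hat{\mathbf{x}}_{i_n}, \hat{\mathbf{x}}_{j_n}, \hat{y}_n)\}_{n=1}^{N}$ be the observations collected in the current stage, as in Algorithm \ref{alg:outer_inner_loop}, and let $p_n(\targetx) = p_{\hat{\mathbf{x}}_{i_n}, \hat{\mathbf{x}}_{j_n}, \targetx}$ be the Bernoulli parameter of the $n$-th reply. Then
\begin{align*}
  p(\targetx \mid \mathcal{D}) \;\propto\; \pi(\targetx) \prod_{n=1}^{N} p_n(\targetx)^{\hat{y}_n} \left(1 - p_n(\targetx)\right)^{1-\hat{y}_n}.
\end{align*}
One possible way to approximate the integrals over $\hat{X}$ and $\region$ is to evaluate the right-hand side on a grid of candidate target locations, normalize the values so that they sum to one, and add up the values of the grid points that fall inside the respective region.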
