\section{Experimental Results}
\begin{figure*}[tbh]
        \centering
        \includegraphics[trim={.2cm 0cm .4cm 0cm}, width=\textwidth]{figures/summary_regrets.pdf}
    \caption{Experimental results. In each plot, the $x$-axis denotes the number of function evaluations.
    The curves show the $\globalf(\instance^t)$ values averaged over at least ten independent trials. The shaded area denotes the standard error.
    The observation noise is sampled from $\normal{(0, 0.01)}$, while the simple regrets shown in the figures are computed without the noise. \reviseFx{We also include additional results on multi-player settings in \appref{sec:additional_results}.}}
    \label{fig: exps}
\end{figure*}

We compare the proposed algorithm \algname with the following baselines.
(1) \algnameglobal removes the ROI identification of \algname and maximizes the proposed acquisition function globally, as discussed in \thmref{thm: simReg}. This comparison serves as an ablation study demonstrating that the introduction of the ROI allows \algname to trade off exploration and exploitation rather than perform pure exploration.
(2) We employ \algprediction and \algepsilongreedy from \cite{al2018approximating} with $\epsilon=0.1$. \algprediction corresponds to their method using the approximated regret as the acquisition function, which is the pure exploitation subroutine of \algepsilongreedy. In contrast, \algepsilongreedy trades off exploration and exploitation: the hyper-parameter $\epsilon$ controls the probability of taking an exploration step, which is carried out by uncertainty reduction.
(3) We compare with \algsur (Stepwise Uncertainty Reduction) proposed by \cite{picheny2019bayesian}, which is essentially global uncertainty reduction on multiple unknown utility functions. For efficiency, we take advantage of recent advancements in deep kernel learning \citep{wilson2016deep, zhang2022learning} and employ it in both the proposed methods and the baseline.

We examine the performance of our proposed algorithm on the following games.
\paragraph{Saddle.}
This is the running example presented in Example~\ref{example:saddle}, which is also studied by \citet{al2018approximating,picheny2019bayesian}.
\paragraph{Rock-Paper-Scissors (RPS).}
\begin{small}
\begin{table}[tbh]
\centering
 \begin{tabular}{|c |c |c |c|} 
 \hline
     & Rock & Paper & Scissors \\
 \hline
 Rock & (0, 0) & (-1, 1) & (1, -1) \\ 
 \hline
 Paper & (1, -1) & (0, 0) & (-1, 1) \\
 \hline
 Scissors & (-1, 1) & (1, -1) & (0, 0) \\
 \hline
 \end{tabular}
 \caption{Payoffs of the rock-paper-scissors game. Each entry $(i, j)$ means that the row agent receives utility $i$ and the column agent receives utility $j$.}
 \label{table:rps_game}
\end{table}
\end{small}
In this game, the two agents' mixed strategies are denoted by 
$x_1, x_2 \in \Delta^2 = \{x \in \rr_{\ge 0}^3: x^r + x^p + x^s =1\},$
and the utilities are defined as 
\begin{small}
\begin{equation}
    \begin{split}
        &u_1(x_1, x_2) = (x_1^p - x_1^s) x_2^r + (x_1^s - x_1^r) x_2^p + (x_1^r - x_1^p) x_2^s, \\
        &u_2(x_1, x_2) = (x_2^p - x_2^s) x_1^r + (x_2^s - x_2^r) x_1^p + (x_2^r - x_2^p) x_1^s.
    \end{split}
\end{equation}    
\end{small}
The NE is attained at $x_1 = x_2 = (1/3, 1/3, 1/3)$.
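
As a quick sanity check (a worked verification added for illustration, using only the utilities defined above), fix $x_2 = (1/3, 1/3, 1/3)$. Then, for every $x_1 \in \Delta^2$,
\[
    u_1(x_1, x_2) = \tfrac{1}{3}\big[(x_1^p - x_1^s) + (x_1^s - x_1^r) + (x_1^r - x_1^p)\big] = 0,
\]
so agent 1 cannot gain by deviating unilaterally. The same computation with the roles exchanged gives $u_2(x_1, x_2) = 0$ for every $x_2 \in \Delta^2$ when $x_1 = (1/3, 1/3, 1/3)$. Hence neither agent has a profitable unilateral deviation from the uniform profile.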

\looseness -1 \paragraph{Hotelling's Game.}
We explore another classical structured game with real-world applications \citep{brenner2005hotelling}. Imagine a market where two firms must choose their locations in a two-dimensional region to attract customers. The utility of each firm depends on the number of customers it draws, so each firm has to balance being close to customers against avoiding excessive competition.
Let us consider the total area as a unit square, so that each firm's action is to choose a location $x = (x^N, x^W) \in [0, 1]^2$. We assume that the customer population is uniformly distributed over the total area and that the two firms post the same price for their products. Therefore, a customer prefers the firm that is closer. Given the two firms' actions $(x_1^N, x_1^W)$ and $(x_2^N, x_2^W)$, each firm's utility is the area of the region of customers who are closer to it than to its competitor. For example, let $S_1 = \{(x^N, x^W) \in [0, 1]^2 \mid (x^N - x^N_1)^2 + (x^W - x^W_1)^2 \le (x^N - x^N_2)^2 + (x^W - x^W_2)^2\}$; then the utility of firm 1 is the area of $S_1$. 
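
As an illustrative worked example (the specific locations are a hypothetical choice and are not taken from the experiments), suppose $(x_1^N, x_1^W) = (1/2, 1/4)$ and $(x_2^N, x_2^W) = (1/2, 1/2)$. The $x^N$ terms cancel, and the condition defining $S_1$ reduces to
\[
    \big(x^W - \tfrac{1}{4}\big)^2 \le \big(x^W - \tfrac{1}{2}\big)^2
    \iff \tfrac{1}{2} x^W \le \tfrac{3}{16}
    \iff x^W \le \tfrac{3}{8}.
\]
Restricted to the unit square, $S_1$ is therefore the rectangle $[0, 1] \times [0, 3/8]$, so firm 1 receives utility $3/8$ and firm 2 receives the remaining $5/8$. In this example the firm located at the center captures the larger share, and the two utilities sum to one because the set of equidistant customers has zero area whenever the two locations differ.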

\paragraph{Marketing Budget Allocation Game.}
Finally, we present a real-world marketing problem, where advertisers seek to maximize the number of customers they reach by allocating a given budget across media channels \citep{maehara2015budget}. Let $G = (S\cup Z,E)$ be a bipartite graph, where the left vertices $S$ denote media channels, the right vertices $Z$ denote customers, and the edges $E \subseteq S \times Z$ denote the relations between channels and customers. Each edge $(s, z) \in E$ has an activation probability $p(s, z) \in [0, 1]$ such that customer $z \in Z$ is activated via channel $s \in S$ with probability $p(s, z)$.

There are $n$ advertisers, and the strategy of advertiser $i$ is a vector $x_i \in \nn_{\ge0}^{|S|}$ of units allocated to the $|S|$ channels. The strategy space for each advertiser is \[X_i = \{x_i \in \nn_{\ge0}^{|S|}: x_i(s) \le  c(s)~\forall s \in S; \langle w, x_i\rangle \le B\},\] 
where $c(s)$ denotes the capacity of channel $s$, $w \in \rr_+^{|S|}$ denotes the per-unit cost of each channel, and $B$ denotes the budget.
Let $\Sigma_n$ denote the set of all permutations of $[n]$. The utility of each advertiser $i \in [n]$ is defined as 
\begin{equation}
    u_i(\x) = \frac{1}{n!} \sum_{z \in Z} \sum_{\sigma \in \Sigma_n} P_i(x_i, z) \prod_{j \prec_{\sigma} i}\big(1-P_j(x_j, z)\big),
\end{equation}
where $P_i(x_i, z) = 1- \prod_{s \in S} (1-p(s,z))^{x_i(s)}$ denotes the probability that customer $z$ is activated by advertiser $i$ under the allocation plan $x_i$, and $j \prec_{\sigma} i$ means that $j$ precedes $i$ in the permutation $\sigma$. In the experiment,
we set $n=2$, $|S|=4$, and $|Z|=12$.
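
For the two-advertiser case used in the experiment, the sum over permutations can be written out explicitly. The following is a derivation sketch that assumes $j \prec_{\sigma} i$ means that $j$ precedes $i$ in $\sigma$. The set $\Sigma_2$ contains the ordering $(1, 2)$, for which the product in $u_1$ is empty and equals one, and the ordering $(2, 1)$, for which the product equals $1 - P_2(x_2, z)$. Hence
\begin{align*}
    u_1(\x) &= \frac{1}{2} \sum_{z \in Z} \Big[ P_1(x_1, z) + P_1(x_1, z)\big(1 - P_2(x_2, z)\big) \Big] \\
            &= \sum_{z \in Z} P_1(x_1, z)\Big(1 - \tfrac{1}{2} P_2(x_2, z)\Big),
\end{align*}
and $u_2(\x)$ follows by exchanging the indices $1$ and $2$. Under this reading, $u_1(\x) + u_2(\x) = \sum_{z \in Z} \big[1 - (1 - P_1(x_1, z))(1 - P_2(x_2, z))\big]$, which is the expected number of customers activated by at least one advertiser.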

\paragraph{Discussion.}
As shown in \figref{fig: exps}, \algname consistently matches or outperforms the baselines. The comparison with \algnameglobal shows that the introduced ROI identification contributes significantly to the overall performance. Though implemented differently, \algnameglobal and \algsur both lack exploitation. Their simple regrets plateau at high values in \figref{fig: exps}(b), (c), and (d), indicating the intrinsic complexity of the corresponding problems. \algepsilongreedy outperforms \algprediction in \figref{fig: exps}(a), (c), and (d), showing the importance of trading off exploration and exploitation in the learning process. \algname outperforms \algepsilongreedy in \figref{fig: exps}(c) and (d), showing that in complex settings \algname achieves a principled and more efficient trade-off.