\section{Experimental Results}
While our primary contributions lie in the formal analysis of the proposed multi‐objective bandit framework and its regret guarantees, we also provide an empirical assessment to illustrate the algorithm’s practical behavior.  All code was written in Python and executed on a laptop with an Intel Core i7-1165G7 CPU (2.80 GHz) and 16 GB of RAM.  

For each problem size \((n,D,T)\), we generate a single set of ``true'' mean vectors \(\{\mu_{a,d}\}_{a=1,\dots,n}^{d=1,\dots,D}\) by sampling each entry uniformly from \([0,1]\). In every round, we then sample stochastic outcomes
\[
R_{a,d}^t\sim\mathrm{Bernoulli}(\mu_{a,d}),
\]
for \(a=1,\dots,n\), \(d=1,\dots,D\), and \(t=1,\dots,T\). At round \(t=T'\), the end of the exploration phase, we run both the exact and the greedy set-cover subroutines on the same empirical means, so that the two methods operate on identical data.
As explained, the exact set-cover algorithm determines the set \(B\) in Algorithm~\ref{SMOMABA} by exhaustive search: it enumerates combinations of candidate sets in order of increasing size and returns the first one whose union covers all arms. This guarantees a minimum-size cover but requires exponential time in the worst case. The greedy approach instead repeatedly selects the candidate set that covers the largest number of still-uncovered arms until full coverage is achieved. It runs in polynomial time and guarantees a cover whose size is within a logarithmic factor of the optimum \citep{vazirani2003approximation}.
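
The two subroutines can be sketched in a few lines of Python. This is only a minimal illustration and not the code used in the experiments: the names \texttt{exact\_cover}, \texttt{greedy\_cover}, and \texttt{cover\_sets}, as well as the toy instance, are hypothetical, and we assume that each candidate arm is described by the set of arms it covers.

```python
from itertools import combinations


def exact_cover(cover_sets, universe):
    """Brute force: try combinations of increasing size, return the first full cover."""
    arms = sorted(cover_sets)
    for k in range(1, len(arms) + 1):
        for combo in combinations(arms, k):
            covered = set().union(*(cover_sets[a] for a in combo))
            if covered >= universe:
                return list(combo)  # first hit has minimum size
    return None  # the candidate sets cannot cover the universe


def greedy_cover(cover_sets, universe):
    """Greedy: repeatedly take the arm covering the most still-uncovered arms."""
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # Ties are broken in favour of the smallest arm index.
        best = max(sorted(cover_sets), key=lambda a: len(cover_sets[a] & uncovered))
        gain = cover_sets[best] & uncovered
        if not gain:
            return None  # no candidate makes progress
        chosen.append(best)
        uncovered -= gain
    return chosen


# Hypothetical toy instance: six arms, three candidate arms with their coverage sets.
universe = set(range(6))
cover_sets = {0: {0, 1, 2}, 1: {0, 1, 3, 4}, 3: {3, 4, 5}}

exact_B = exact_cover(cover_sets, universe)    # [0, 3]
greedy_B = greedy_cover(cover_sets, universe)  # [1, 0, 3]
```

On this toy instance the greedy rule first picks arm 1, which covers four arms, and then needs two further arms, whereas the exhaustive search finds a two-arm cover. The gap illustrates why the greedy method carries only an approximation guarantee.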
%


We conduct experiments for \(n \in \{20, 50, 100\}\) and \(D \in \{2, 3, 5\}\), and we choose the total horizon \(T\) large enough so that the confidence radius is approximately 0.02.
In theory, \(r = \sqrt{2\ln T / T'}\) ensures that adding \(2r\) to each empirical mean yields valid dominance relations. For short exploration phases, however, this formula easily produces values of \(r\) close to one, in which case the cover set \(B\) collapses to a single arm. Achieving a practically meaningful margin (e.g.\ \(r\approx0.02\)) requires \(T'\), and thus \(T\), to grow by orders of magnitude. Consequently, in practice one typically calibrates \(r\) below its theoretical value, accepting a slight relaxation of the guarantee in exchange for a non-degenerate arm selection. For example, choosing \(T=10^8\) yields \(T'\approx 10^5\) and \(r\approx0.02\).
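As a quick check of these magnitudes, and treating the exploration length \(T'=10^5\) as given, substituting \(T=10^8\) into the expression for \(r\) gives
\[
r=\sqrt{\frac{2\ln T}{T'}}=\sqrt{\frac{2\ln 10^{8}}{10^{5}}}\approx\sqrt{\frac{36.84}{10^{5}}}\approx 0.019,
\]
which is consistent with the quoted margin of roughly \(0.02\). Conversely, for fixed \(T=10^8\), the requirement \(r\le 0.02\) translates into \(T'\ge 2\ln T/0.02^{2}\approx 9.2\times10^{4}\) exploration rounds.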
Each configuration is repeated independently 10 times to reduce sampling noise. We measure (i) the average size of the true Pareto-optimal (PO) set, (ii) the average cover size \(\lvert B\rvert\) returned by the algorithm, and (iii) the average wall-clock time that the exact and greedy approaches take to compute the set cover. These values are reported in Table~\ref{result_table}.


\begin{table}[h!]
    \centering
    \caption{Average true Pareto-optimal (PO) set size, cover size \(|B|\), and runtime of the exact and greedy set-cover subroutines over 10 independent runs per configuration. All instances use $T=10^8$ rounds.}
    \label{result_table}
    \begin{tabular}{cc|c|cc|cc}
        \hline
        \multirow{2}{*}{$n$} & \multirow{2}{*}{$D$} & \multirow{2}{*}{Avg.\ true PO} 
            & \multicolumn{2}{c|}{Exact set cover} 
            & \multicolumn{2}{c}{Greedy set cover} \\
        & & 
            & \(|B|\) & Time (s) 
            & \(|B|\) & Time (s) \\
        \hline
        20  & 2  &  4.3 & 3.2 & 0.0 & 3.2 & 0.0 \\
        20  & 3  &  7.2 & 4.8 & 0.1 & 4.9 & 0.0 \\
        20  & 5 &  14.2 & 12.6 & 0.6 & 12.7 & 0.0 \\
        \hline
        50  & 2  &  4.8 & 2.6 & 0.0 & 2.6 & 0.0 \\
        50  & 3  &  10.5 & 6.0 & 0.2 & 6.1 & 0.0 \\
        50  & 5 &  18.2 & 11.2 & 6.1 & 11.5 & 0.1 \\
        \hline
        100 & 2  &  5.6 & 2.2 & 0.0 & 2.2 & 0.0 \\
        100 & 3  &  14.1 & 7.2 & 4.95 & 7.5 & 0.0 \\
        100 & 5 &  26.7 & 15.1 & 29.0 & 15.7 & 0.2 \\
        \hline
    \end{tabular}
\end{table}


Table~\ref{result_table} shows that the average Pareto-optimal set grows substantially as the number of arms \(n\) or the number of objectives \(D\) increases, from roughly 4 to 6 arms at \(D=2\) to about 27 arms at \(n=100\), \(D=5\). Both the exact and the greedy set-cover methods consistently produce cover sets \(B\) that are smaller than the true Pareto front, so the number of candidate arms is reduced while coverage is preserved. On average, the greedy cover is less than one arm larger than the exact one (e.g., 7.5 vs.\ 7.2 at \(n=100\), \(D=3\)), and it is computed in a fraction of the time: at \(n=100\), \(D=5\), the greedy method takes 0.2\,s whereas the exact solver takes 29.0\,s. These results indicate that the greedy approximation achieves near-optimal cover sizes in practice, with large savings in computation time as the problem dimensions grow.


Another notable feature of the algorithm is its economy in the exploitation phase: after exploration, it pulls only the arms in a small set \(B\). For instance, with \(n=100\), \(D=5\), and \(T=10^8\), the procedure dedicates only \(T' = 97\,295\) rounds to exploration and then confines all remaining pulls to about 15 arms (\(\lvert B\rvert \approx 15\)), out of roughly 27 true Pareto-optimal candidates. To illustrate this behavior for \(D=2\), Figure~\ref{fig:D2} presents a representative trial for each of \(n=20\), \(50\), and \(100\), plotting the true means (black circles), the PO arms (red stars), and the final cover set \(B\) (blue circles). In the right panel (\(n=100\)), for example, 8 of the 100 arms are Pareto-optimal, yet the algorithm selects only 3 arms to pull in the exploitation phase. Once \(2r\) is added to the empirical means of these three arms, they suffice to dominate all other arms in the instance.
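
The following Python sketch shows how a cover can end up smaller than the Pareto-optimal set. It rests on an assumption about the coverage rule: we take arm \(a\) to cover arm \(b\) when the empirical mean of \(a\), shifted up by \(2r\), is at least the empirical mean of \(b\) in every objective. The function names and the five-arm instance are hypothetical, and the listed means stand in for empirical means.

```python
def dominates(u, v):
    """True if u is at least as good as v everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))


def pareto_optimal(means):
    """Indices of arms that no other arm dominates."""
    return [a for a, u in enumerate(means)
            if not any(dominates(v, u) for b, v in enumerate(means) if b != a)]


def covers(u_hat, v_hat, r):
    """Assumed relaxed rule: u_hat shifted up by 2r is at least v_hat in every objective."""
    return all(x + 2 * r >= y for x, y in zip(u_hat, v_hat))


def coverage_sets(emp_means, r):
    """For each arm, the set of arms it covers under the relaxed rule."""
    return {a: {b for b, v in enumerate(emp_means) if covers(u, v, r)}
            for a, u in enumerate(emp_means)}


# Hypothetical instance with D = 2 objectives and five arms.
means = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4), (0.52, 0.49)]
r = 0.02

po_arms = pareto_optimal(means)   # arms 0, 1, 2 and 4 are Pareto-optimal
sets = coverage_sets(means, r)    # arm 1 covers arms 1, 3 and 4
```

Here four arms are Pareto-optimal, but arms 1 and 4 cover each other once the \(2r\) margin is added, so the three arms \(\{0,1,2\}\) already cover the whole instance.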

\begin{figure}[h!]
  \centering

  \subfloat[$n=20$]{%
    \includegraphics[width=0.31\linewidth]{Figs/Fig_n20_d2.png}%
    \label{fig:n20_d2}
  }
  \hfill
  \subfloat[$n=50$]{%
    \includegraphics[width=0.31\linewidth]{Figs/Fig_n50_d2.png}%
    \label{fig:n50_d2}
  }
  \hfill
  \subfloat[$n=100$]{%
    \includegraphics[width=0.31\linewidth]{Figs/Fig_n100_d2.png}%
    \label{fig:n100_d2}
  }

  \caption{True mean rewards and selected cover sets for three problem sizes with $D=2$. The axes represent the reward values in each of the two dimensions. Black circles indicate the true mean rewards of the arms, red stars denote the PO arms, and blue circles highlight the arms selected by the algorithm to form the set \(B\).}
  \label{fig:D2}
\end{figure}




