\section{Applications and Experiments}
\label{sec:exp}
\begin{figure*}[t!]
    \centering
    %\hspace{-0.5em}
     \subfigure[delicious\_300 samples vs.\ $\epsilon$]
{\label{fig:cover2500_300_eps_q}\includegraphics[width=0.24\textwidth]{figures/cover_n2500_300_sigma_1.0_atg_eps-q}} 
%\hspace{-0.5em}
     \subfigure[delicious\_300 average samples vs.\ $\epsilon$]
{\label{fig:cover2500_300_eps_average-q}\includegraphics[width=0.24\textwidth]{figures/cover_n2500_300_sigma_1.0_atg_eps-ave_q}}
    %\hspace{-0.5em}
     \subfigure[delicious\_300 samples vs.\ $\kappa$]
{\label{fig:cover_300_k_q}\includegraphics[width=0.24\textwidth]{figures/cover_n2500_300_sigma_1.0_atg_k-q}}
%\hspace{-0.5em}
\subfigure[delicious\_300 average samples vs.\ $\kappa$]
{\label{fig:cover_300_k-ave_q}\includegraphics[width=0.24\textwidth]{figures/cover_n2500_300_sigma_1.0_atg_k-ave_q}}
%\hspace{-0.5em}
\subfigure[corel\_60 samples vs.\ $\epsilon$]
{\label{fig:corel_60_eps_q}\includegraphics[width=0.24\textwidth]{figures/corel_60_sigma_1.0_atg_eps-q}}
%\hspace{-0.5em}
\subfigure[corel\_60 average samples vs.\ $\epsilon$]
{\label{fig:corel_60_eps_ave_q}\includegraphics[width=0.24\textwidth]{figures/corel_60_sigma_1.0_atg_eps-ave_q}}
%\hspace{-0.5em}
\subfigure[corel\_60 samples vs.\ $\kappa$]
{\label{fig:corel_60_k-q}\includegraphics[width=0.24\textwidth]{figures/corel_60_sigma_1.0_atg_k-q}}
%\hspace{-0.5em}
\subfigure[corel\_60 average samples vs.\ $\kappa$]
{\label{fig:corel_60_k_ave_q}\includegraphics[width=0.24\textwidth]{figures/corel_60_sigma_1.0_atg_k-ave_q}}
%\hspace{-0.5em}
\subfigure[delicious samples vs.\ $\epsilon$]
{\label{fig:cover_eps_q}\includegraphics[width=0.24\textwidth]{figures/cover_n5000_sigma_1.0_atg_eps-q}}
%\hspace{-0.5em}
\subfigure[delicious samples vs.\ $\kappa$]
{\label{fig:cover_k_q}\includegraphics[width=0.24\textwidth]{figures/cover_n5000_sigma_1.0_atg_k-q}}
%\hspace{-0.5em}
\subfigure[corel samples vs.\ $\epsilon$]
{\label{fig:corel_eps-q}\includegraphics[width=0.24\textwidth]{figures/corel_sigma_1.0_atg_eps-q}}
%\hspace{-0.5em}
\subfigure[corel samples vs.\ $\kappa$]
{\label{fig:corel_eps_ave_q}\includegraphics[width=0.24\textwidth]{figures/corel_sigma_1.0_atg_k-q}}

\caption{Experimental results of running different algorithms on instances of data summarization on the Delicious URL dataset (``delicious'', ``delicious\_300'') and the Corel5k dataset (``corel'', ``corel\_60'').}
\label{fig:exp_results}
\end{figure*}
In this section, we experimentally evaluate our algorithm \alg on instances of \prob with noisy marginal gain evaluations. In particular, we consider instances of the noisy data summarization application, which is described in Section~\ref{sec:data_summarization} in the appendix. Synthetic noise is introduced into each marginal gain query by adding a zero-mean Gaussian random variable with standard deviation $\sigma=1.0$ to the exact marginal gain, so the parameter $R$ is $1.0$. Our experiments are conducted on a subset of the Delicious dataset of URLs tagged with topics \citep{soleimani2016semi} and on subsets of the Corel5k dataset of tagged images \citep{duygulu2002object}. More details about these datasets, as well as additional experiments on the influence maximization problem, are given in the appendix in the supplementary material. The setup of our experiments is described in Section~\ref{sec:setup}, and our results are presented in Section~\ref{sec:exp_results}.
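To make the noise model concrete, the following is a short sketch under assumed notation: $\widetilde{\Delta f}$ for a noisy query, $\xi$ for the noise, $\hat{\mu}_m$ for the empirical mean of $m$ independent queries, and $t$, $\delta$ for a hypothetical accuracy and failure probability. None of these symbols are taken from the definition of \alg, and we additionally assume that $R$ denotes the sub-Gaussian parameter of the noise. Each query to the marginal gain of an element $u$ with respect to a set $S$ returns
\[
\widetilde{\Delta f}(S,u) = \Delta f(S,u) + \xi, \qquad \xi \sim \mathcal{N}(0,\sigma^2).
\]
A zero-mean Gaussian with standard deviation $\sigma$ is $\sigma$-sub-Gaussian, which is consistent with $R=\sigma=1.0$. Averaging $m$ independent queries then gives the standard tail bound
\[
\Pr\left( \left| \hat{\mu}_m - \Delta f(S,u) \right| \geq t \right) \leq 2\exp\left( -\frac{m t^2}{2R^2} \right),
\]
so $m \geq 2R^2\log(2/\delta)/t^2$ samples suffice to reach accuracy $t$ with probability at least $1-\delta$. For example, with $R=1.0$ and the illustrative values $t=0.1$ and $\delta=0.05$, this requires $m \geq 2\log(40)/0.01 \approx 738$ samples for a single marginal gain, which suggests why sampling every marginal gain to a fixed accuracy becomes expensive as the target accuracy shrinks.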


\subsection{Experimental setup}
\label{sec:setup}
We now describe the setup of our experiments. We compare our algorithm \alg against the following alternative approaches to noisy \prob: (i) the fixed $\epsilon$ approximation algorithm (``\texttt{EPS-AP}''); (ii) two special cases of the algorithm \singla of \citet{singla2016noisy}, ``\texttt{EXP-GREEDY}'' and ``\texttt{EXP-GREEDY-K}'', with the parameter $k'$ in \singla set to $k'=1$ and $k'=\kappa$, respectively. More details about these three algorithms can be found in the appendix.
We evaluate \alg and \texttt{EPS-AP} on all the datasets. However, \texttt{EXP-GREEDY} and \texttt{EXP-GREEDY-K} have a greater runtime, as discussed in the appendix in the supplementary material, so we only evaluate them on the smaller datasets.
Details about the parameter settings can also be found in the appendix in the supplementary material.

\subsection{Experimental results}
\label{sec:exp_results}

We now present our experimental results. The algorithms are compared in terms of: (i) the function value $f$ of their solution; (ii) the total number of noisy samples of the marginal gain; (iii) the average number of samples per marginal gain estimation, i.e., the total number of samples divided by the number of evaluated marginal gains.
 % The reason we include item (iii) is because \alg and \texttt{EPS-AP} are based on the threshold greedy algorithm (\threshold) of \citet{badanidiyuru2014fast} which only makes $O(n\log(\kappa/\alpha)/\alpha)$ marginal gain queries, while \texttt{EXP-GREEDY} and \texttt{EXP-GREEDY-K} are based on the standard greedy algorithm \cite{nemhauser1978analysis} which makes $O(n\kappa)$ marginal gain queries. Therefore by comparing along (iii) we normalize for this difference.
%and compare the algorithms on how many samples they have to take per marginal gain.
%The difference is that \alg and \texttt{EPS-AP} are sampling to decide whether a marginal gain is above or below a threshold, while \texttt{EXP-GREEDY} and \texttt{EXP-GREEDY-K} are sampling to decide which among the elements in $U$ have the highest marginal gains to the solution.

Our results for different values of $\epsilon$ and $\kappa$ are presented in Figure~\ref{fig:exp_results}. From Figures~\ref{fig:cover2500_300_eps_q}, \ref{fig:cover_300_k_q}, \ref{fig:corel_60_eps_q} and \ref{fig:corel_60_k-q}, one can see that the total number of samples required by \alg tends to be smaller than that required by \texttt{EPS-AP}, \texttt{EXP-GREEDY} and \texttt{EXP-GREEDY-K}. This demonstrates the advantage of \alg in sample efficiency, which is the main goal of this paper. The comparison of average samples is less uniform. On the delicious\_300 dataset (Figures~\ref{fig:cover2500_300_eps_average-q} and \ref{fig:cover_300_k-ave_q}), the average number of samples of \texttt{EXP-GREEDY-K} is slightly better than that of \alg, whereas on the corel\_60 dataset (Figures~\ref{fig:corel_60_eps_ave_q} and \ref{fig:corel_60_k_ave_q}), \alg requires significantly fewer average samples than \texttt{EXP-GREEDY-K}. This illustrates that the instance-dependent sample complexity bounds for marginal gain computations in \alg and in \singla are incomparable.

From the results where we vary $\epsilon$, it can be seen that both the total samples and the average samples of our algorithm \alg increase more slowly than those of \texttt{EPS-AP} and \texttt{EXP-GREEDY} as $\epsilon$ decreases (Figures~\ref{fig:cover2500_300_eps_q}, \ref{fig:cover2500_300_eps_average-q}, \ref{fig:corel_60_eps_q} and \ref{fig:corel_60_eps_ave_q}), which agrees with our theoretical results (see the discussion in Section~\ref{results} in the appendix).
%$O(nR^2\min\{\frac{4}{\Delta_{1}^2},\frac{1}{\epsilon^2}\}\log\big(\frac{R^2kn\min\{\frac{4}{\Delta_{1}^2},\frac{1}{\epsilon^2}\}}{\delta}\big))$. $\Delta_1$ is the the difference of the highest and second highest marginal gain. In many cases of submodular maximization, the difference is very small, therefore the result becomes $O(\frac{nR^2}{\epsilon^2}\log\big(\frac{R^2kn}{\delta\epsilon^2}\big))$. The average number of queries is depicted in Figure \ref{fig:cover2500_300_eps_average-q}, \ref{fig:cover_300_k-ave_q}, \ref{fig:corel_60_eps_ave_q} and \ref{fig:corel_60_k_ave_q} for different dataset and in terms of different $\epsilon$ and $\kappa$. From the results, we can see that number of average queries of \alg is smaller than that of \texttt{EPS-AP} and \texttt{EXP-GREEDY}, and is comparable to \texttt{EXP-GREEDY-K}.
For the experiments comparing different values of $\kappa$, we can see that the total number of samples of \texttt{EXP-GREEDY} and \texttt{EXP-GREEDY-K} increases faster than that of \texttt{EPS-AP} and \alg (Figure~\ref{fig:cover_300_k_q}), which can be attributed to the better dependence on $\kappa$ that \threshold exhibits compared to the standard greedy algorithm.
%which is because of the number of iterations required by the first two algorithms is $O(\kappa)$ and is higher than $O(\log \kappa)$.
One exception to this trend is that the total number of samples of \texttt{EXP-GREEDY-K} decreases on the corel\_60 dataset when $\kappa$ becomes large (Figure~\ref{fig:corel_60_k-q}). This is because, as $\kappa$ increases, \texttt{EXP-GREEDY-K} is better able to deal with tiny differences in marginal gains (see the appendix).
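As a rough sketch of the dependence on $\kappa$ discussed above, consider only the number of marginal gains that each base algorithm evaluates, ignoring how many samples each evaluation takes. The symbols $N_{\mathrm{thr}}$ and $N_{\mathrm{greedy}}$ are introduced here purely for illustration, and $\alpha$ is assumed to denote the threshold-decay parameter of \threshold. Bounds of the following form hold:
\[
N_{\mathrm{thr}} = O\!\left( \frac{n}{\alpha} \log\frac{\kappa}{\alpha} \right), \qquad N_{\mathrm{greedy}} = O(n\kappa).
\]
For fixed $n$ and $\alpha$, the first quantity grows logarithmically in $\kappa$ while the second grows linearly. Since the total number of samples is the number of evaluated marginal gains multiplied by the average number of samples per evaluation, this difference carries over to the total sample counts unless it is offset by the per-evaluation cost.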

Finally, the results of \alg and \texttt{EPS-AP} on the larger datasets (corel and delicious) are presented in Figures~\ref{fig:cover_eps_q}, \ref{fig:cover_k_q}, \ref{fig:corel_eps-q} and \ref{fig:corel_eps_ave_q}. Our proposed algorithm \alg shows considerable advantages over \texttt{EPS-AP} in terms of both total samples and average samples.
%This distinction is particularly pronounced in scenarios involving smaller values of $\epsilon$ and larger values of $\kappa$, demonstrating the effectiveness of our adaptive sampling strategy in \samp.

% At last, the results on the larger dataset (corel and delicious) comparing \alg and \texttt{EPS-AP} also demonstrate the effectiveness of our adaptive sampling strategy in \samp.



% since these two algorithms evaluate only one marginal gain at a time. At each time the two algorithms take a noisy query, only the confidence interval of the evaluated marginal gain is updated, and we only need to compare the empirical marginal gain to the threshold.  
%Second, we compare the algorithms in terms of required queries. Since the number of iterations of the algorithms for \texttt{EXP-GREEDY} and \texttt{EXP-GREEDY-K} is $\kappa$ while the number of iterations for \texttt{EPS-AP} and \alg is $O(\log \kappa)$, here we also plot the average queries to $\Delta f$ per each evaluated marginal gain to compare the efficiency of the sampling procedure per each marginal gain. The total number of queries and average queries of the algorithms for different $\epsilon$ and $\kappa$ are presented in Figure \ref{fig:exp_results}. 

