\subsection{Randomized algorithm $\mc{A}_2$} \label{sec:A2}
\begin{figure*}[ht]
\centering
\includegraphics[width=\textwidth]{Figures/W2S_LLP_Algo.pdf}
\caption{Overview of our proposed randomized algorithm for obtaining strong classifiers on original bags using a weak classifier on composite bags.}
\label{fig:w2s_llp_algorithm}
\end{figure*}

\begin{figure}[!htb]
\begin{mdframed}
\small
\textbf{Input:} Bags $\mc{B}$, $k = \max_{(B,\sigma) \in \mc{B}} |B|$, $\alpha > 0$, $t$, oracle $\mc{O}_{kt, \alpha}$, $s \in \mathbb{Z}^+$.\\
\textbf{Steps:}
\begin{enumerate}
    \item Let $\hat{\mc{B}} = \{(\hat{B}_j,\hat{\sigma}_j)\}_{j=1}^s$ be $s$ i.i.d. samples from $\ol{D}$ (Fig. \ref{algo:DistnDbar}). 
    \item Output the classifier $\tilde{h}$ given by  $\mc{O}_{kt,\alpha}(\hat{\mc{B}})$.
\end{enumerate}
\end{mdframed}
\caption{Algorithm $\mc{A}_2$.}\label{algo:A2}
\end{figure}
Figure \ref{algo:A2} provides the algorithm $\mc{A}_2$.
Fix any $h$ that has accuracy $< (1- \eps)$ on $\mc{B}$. Then, by Lemma \ref{lem:errorampl} and our setting of $t$, we obtain that $\Pr_{(\hat{B}, \hat{\sigma})\leftarrow \ol{D}}[(\hat{B}, \hat{\sigma})\tn{ satisfied by } h] \leq \alpha/2$. Therefore, for the samples drawn in Step 1 of $\mc{A}_2$, it is easy to see by monotonicity that 
\begin{eqnarray}
  \Pr\left[\left|\{j \in [s]\,\mid\, (\hat{B}_j,\hat{\sigma}_j) \tn{ satisfied by } h\}\right| \geq \alpha s\right] \leq \nonumber \\ \Pr\left[\sum_{\ell = 1}^s X_\ell \geq \alpha s\right] \label{eqn:randomAlg-1}
\end{eqnarray}
where each $X_\ell$ ($\ell = 1, \dots, s$) is an independent $\{0,1\}$-valued Bernoulli random variable taking value $1$ with probability $\alpha /2$. Therefore, using the Chernoff upper-tail bound from Lemma \ref{lemma:chernoff_bounds}, we can upper bound the LHS of \eqref{eqn:randomAlg-1} by $\tn{exp}(-\alpha s/6)$, which is therefore an upper bound on the probability that $h$ has accuracy $\geq \alpha$ on $\hat{\mc{B}}$.
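
For concreteness, the constant $6$ can be traced as follows, assuming that Lemma \ref{lemma:chernoff_bounds} is stated in the standard multiplicative form $\Pr[X \geq (1+\gamma)\mu] \leq \exp(-\gamma^2\mu/3)$ for $0 < \gamma \leq 1$ (the symbol $\gamma$ is introduced here only for illustration). With $X = \sum_{\ell=1}^s X_\ell$ we have $\mu = \mathbb{E}[X] = \alpha s/2$, and the threshold $\alpha s$ equals $(1+\gamma)\mu$ for $\gamma = 1$, so that
\[
  \Pr\left[\sum_{\ell = 1}^s X_\ell \geq \alpha s\right] \leq \exp\left(-\frac{1}{3}\cdot\frac{\alpha s}{2}\right) = \exp\left(-\frac{\alpha s}{6}\right).
\]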

Let $\mc{C}$ be the classifier class to which the output of $\mc{O}_{kt, \alpha}$ is guaranteed to belong. With $n$ being the total number of distinct feature-vectors in the bags $\mc{B}$, $\Pi_{\mc{C}}(n)$ (as given in Theorem \ref{theorem:vcdim_growth_function}) is the number of possible $\{0,1\}$-assignments to $n$ points induced by classifiers in $\mc{C}$. Taking a union bound over all of them, we obtain that with probability at most $\Pi_{\mc{C}}(n)\tn{exp}(-\alpha s/6)$ the output of $\mc{A}_2$ has accuracy less than $(1 - \eps)$ on $\mc{B}$. 

When $\mc{C}$ is unrestricted, $\Pi_{\mc{C}}(n) \leq 2^n$, and therefore $\Pi_{\mc{C}}(n)\tn{exp}(-\alpha s/6) \leq \delta$ is ensured by taking $s = O\left((n + \log(1/\delta))/\alpha\right)$. On the other hand, if the VC dimension of $\mc{C}$ is at most $r$, then $\Pi_{\mc{C}}(n) \leq (en/r)^r$ (from Theorem \ref{theorem:vcdim_growth_function}), and therefore taking $s = O\left(\frac{1}{\alpha}\left(r\log\left(\frac{n}{r}\right) + \log\left(\frac{1}{\delta}\right)\right)\right)$ suffices.
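
As a sketch of how these sample sizes arise (the explicit constants below are illustrative and not optimized), taking logarithms shows that $\Pi_{\mc{C}}(n)\exp(-\alpha s/6) \leq \delta$ holds exactly when
\[
  s \geq \frac{6}{\alpha}\left(\ln \Pi_{\mc{C}}(n) + \ln\frac{1}{\delta}\right).
\]
Substituting $\Pi_{\mc{C}}(n) \leq 2^n$ gives the sufficient condition $s \geq \frac{6}{\alpha}\left(n\ln 2 + \ln(1/\delta)\right)$, while substituting $\Pi_{\mc{C}}(n) \leq (en/r)^r$ gives $s \geq \frac{6}{\alpha}\left(r\ln(en/r) + \ln(1/\delta)\right)$.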

Figure \ref{fig:w2s_llp_algorithm} illustrates how our algorithm obtains a strong classifier for the original (small) bags using a weak classifier trained on composite (large) bags.
