\section{Preliminaries} \label{sec:prelims}
\subsection{Notations and Problem Definition} \label{sec:notations}
Let $\mbc{X}$ be a set of real feature-vectors, i.e., $\mbc{X}\subseteq \R^{d_0}$ for some $d_0 \in \Z^+$. A \emph{bag} $B$ is a subset of $\mbc{X}$.
An instance $\mc{I}$ of \pmir consists of a collection $\mc{B}$ of $m$ bags $\{B_1, \dots, B_m\}$ along with a label vector ${\bm \sigma} = (\sigma_1, \dots, \sigma_m)$, where $\sigma_j$ is the label of bag $B_j$. The goal is to find:
\begin{itemize}[noitemsep,nolistsep]
    \item a predictor $h : \mbc{X} \to \R$, and
    \item an assignment $\Gamma : \mc{B} \to \mbc{X}$ s.t. $\Gamma(B) \in B$ for all $B \in \mc{B}$, indicating the \emph{primary} instance (i.e., feature-vector) of each bag,
\end{itemize}
minimizing the following objective
\begin{equation}
    \val\left(L_{\tn{reg}}, \mc{I}, h, \Gamma\right) := \E_{j\in [m]} \left[L_{\tn{reg}}\left(\sigma_j, h\left(\Gamma(B_j)\right)\right)\right] \label{eqn-valIhGamma}
\end{equation}
for some loss function $L_{\tn{reg}}$. For convenience we subsume the optimization over $\Gamma$ by defining:
\begin{equation}
    \val\left(L_{\tn{reg}}, \mc{I}, h\right) := \min_{\Gamma} \E_{j\in [m]} \left[L_{\tn{reg}}\left(\sigma_j, h\left(\Gamma(B_j)\right)\right)\right] \label{eqn-valIh}
\end{equation}

If there are no other constraints on $\Gamma$, then clearly the minimum in \eqref{eqn-valIh} is attained by choosing $\Gamma(B_j) = \tn{arg min}_{\bx \in B_j} L_{\tn{reg}}(\sigma_j, h(\bx))$ for each $j \in [m]$. When the bags overlap, one may add the constraint that for each $\bx \in \mbc{X}$, $\left|\{B \in \mc{B}\,\mid\, \bx \in B, \Gamma(B) = \bx\}\right| \leq 1$, i.e., an instance may be primary for at most one bag. We shall refer to this problem as \emph{injective} \pmir.
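
As a small illustration (with made-up numbers, chosen only to make the arithmetic easy), take the squared loss $L_{\tn{reg}}(a, b) = |a - b|^2$, $m = 2$ overlapping bags $B_1 = \{\bm{u}, \bm{v}\}$ and $B_2 = \{\bm{v}, \bm{w}\}$ with labels ${\bm \sigma} = (1, 1)$, and a predictor with $h(\bm{u}) = 0.5$, $h(\bm{v}) = 0.9$ and $h(\bm{w}) = 0.4$. The losses incurred against the label $1$ are
\[
    L_{\tn{reg}}(1, h(\bm{u})) = 0.25, \qquad L_{\tn{reg}}(1, h(\bm{v})) = 0.01, \qquad L_{\tn{reg}}(1, h(\bm{w})) = 0.36 .
\]
Without further constraints both bags select $\bm{v}$ as their primary instance, and the objective equals $(0.01 + 0.01)/2 = 0.01$. In the injective variant $\bm{v}$ may be primary for only one bag, which leaves three feasible assignments $(\Gamma(B_1), \Gamma(B_2))$:
\[
    (\bm{u}, \bm{v}) :\ \frac{0.25 + 0.01}{2} = 0.13, \qquad
    (\bm{v}, \bm{w}) :\ \frac{0.01 + 0.36}{2} = 0.185, \qquad
    (\bm{u}, \bm{w}) :\ \frac{0.25 + 0.36}{2} = 0.305 .
\]
The injective optimum is therefore $0.13$, attained by $\Gamma(B_1) = \bm{u}$ and $\Gamma(B_2) = \bm{v}$. In this toy example the injective constraint strictly increases the optimal value, and the best assignment no longer picks the per-bag minimizer for $B_1$.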


For brevity we shall denote by $\val_p\left(\mc{I}, h\right)$ the LHS of \eqref{eqn-valIh} when $L_{\tn{reg}}(a, b) := |a - b|^p$, for any $p \geq 1$. In particular, $\val_2$ corresponds to the mean squared error (MSE) loss.

Let $\mc{D}$ be a distribution on $\mbc{X}$. % 
For some $f : \mbc{X} \to [0,R]$ and $k \in \Z^{+}$, an instance of \iidpmir$[f, k, m]$ is a random problem instance $\mc{I}$ of \pmir with $m$ bags where independently for each $j \in [m]$: \\
(i) bag $B_j = \{\bx_{1j}, \dots, \bx_{kj}\}$,  where $\bx_{ij} \sim \mc{D}$, independently for $i = 1,\dots, k$, and \\(ii) $\sigma_j = f(\bx_{1j})$.
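
For instance, in a toy setting chosen only for illustration, let $d_0 = 1$, let $\mc{D}$ be the uniform distribution on $[0,1]$, let $f(x) = x$ (so that $R = 1$ suffices) and let $k = 2$. Each bag is then a pair $B_j = \{x_{1j}, x_{2j}\}$ of independent uniform samples with label $\sigma_j = x_{1j}$; being a set, the bag does not record which of its two elements generated the label. Taking $h = f$ together with the assignment $\Gamma(B_j) = x_{1j}$ incurs zero loss on every bag, so that
\[
    \val_p\left(\mc{I}, f\right) = 0 \quad \tn{for every } p \geq 1 .
\]
The same argument applies to any choice of $\mc{D}$, $f$ and $k$: the labeling function $f$ always attains the minimum possible value of the objective in \eqref{eqn-valIh} on such an instance.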

\subsection{Useful Concepts and Tools}\label{sec:usefulconcepts}
For our generalization error bound, we shall restrict ourselves to a class $\mc{F}$ of real-valued functions (regressors) over $\mbc{X}$ with values (i.e., predictions) in $[0, R]$ for some $R \in \R$ s.t. $R \geq 1$. For any $\mbc{X}' \subseteq \mbc{X}$ s.t. $|\mbc{X}'| = N$, let $\mc{C}_p(\xi, \mc{F}, \mbc{X}')$ denote a minimum cardinality $\ell_p$-metric $\xi$-cover of $\mc{F}$ over $\mbc{X}'$, for some $\xi > 0$. Specifically, $\mc{C}_p(\xi, \mc{F}, \mbc{X}')$ is a minimum sized subset of $\mc{F}$ such that for each $f^* \in \mc{F}$, there exists $f \in \mc{C}_p(\xi, \mc{F}, \mbc{X}')$ s.t. $\left(\E_{\bx \in \mbc{X}'}\left[\left|f^*(\bx) - f(\bx)\right|^p\right]\right)^{1/p} \leq \xi$ for $p \in [1,\infty)$, and $\max_{\bx\in \mbc{X}'}\left|f^*(\bx) - f(\bx)\right| \leq \xi$ for $p =\infty$.


The maximum size of such a cover over all choices of $\mbc{X}' \subseteq \mbc{X}$ s.t. $|\mbc{X}'| = N$ is defined to be $N_p(\xi, \mc{F}, N)$. In other words, for every such $\mbc{X}'$ and every $p \in [1,\infty]$, a cover of size at most $N_p(\xi, \mc{F}, N)$ always exists. We refer the reader to Sections 10.2--10.4 of \citep{Anthony-Bartlett} for more details (see also Chapter 2.2 of \cite{VW96}).
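
As a simple sanity check of these definitions, consider the following toy example. Let $\mc{F}$ be the class of constant functions $f_c \equiv c$ with $c \in [0, R]$. For any $\mbc{X}'$ and any $p \in [1,\infty]$, the distance between $f_c$ and $f_{c'}$ over $\mbc{X}'$ equals $|c - c'|$, so a $\xi$-cover is simply a set of constants in $[0,R]$ such that every point of $[0, R]$ is within $\xi$ of one of them. With $R = 1$ and $\xi = 0.1$, the five constants
\[
    c \in \{0.1,\ 0.3,\ 0.5,\ 0.7,\ 0.9\}
\]
form such a cover. Since each constant covers an interval of length at most $2\xi = 0.2$, four constants cover a total length of at most $0.8 < 1$ and hence cannot suffice. Therefore $N_p(0.1, \mc{F}, N) = 5$ for every $N \geq 1$ and every $p \in [1,\infty]$ in this example.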

The \emph{pseudo-dimension} of $\mc{F}$, ${\sf Pdim}(\mc{F})$, is a measure of the complexity of $\mc{F}$. As described in Sec. 10.4 and 12.3 of \citep{Anthony-Bartlett}, the pseudo-dimension can be used to bound the size of covers for $\mc{F}$ as follows:
\begin{equation}
    N_1(\xi, \mc{F}, N) \leq N_\infty(\xi, \mc{F}, N) \leq \left(\frac{eNR}{\xi d}\right)^d
\end{equation}
where $d = {\sf Pdim}(\mc{F})$ and $N \geq d$.
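
To get a feel for the magnitude of this bound, consider the illustrative values $d = 2$, $N = 100$, $R = 1$ and $\xi = 0.1$ (chosen arbitrarily, and reading the bound with $\xi d$ in the denominator). Then
\[
    \left(\frac{eNR}{\xi d}\right)^d = \left(\frac{100\,e}{0.2}\right)^2 = (500\,e)^2 \approx 1.85 \times 10^{6},
\]
while the logarithm of the bound is $d \ln\left(eNR/(\xi d)\right) = 2\left(1 + \ln 500\right) \approx 14.43$. In particular, the logarithm of the cover size depends only logarithmically on $N$, $R$ and $1/\xi$.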

