\section{Impossibility of Boosting in MIL} \label{sec:impossibility_mil}



Along similar lines as the previous section, we provide a geometric construction for MIL on bags of size $2$. We begin with a continuous set of points, which we analyze and subsequently discretize while preserving its key properties. Throughout, we fix a parameter $\alpha \in (1/2, 1)$.

\noindent
{\bf Construction.} Let $\bm{\mc{X}}_c$ be the set of all points on the unit circle $\mathbb{S}^{1}$. For any two points that subtend an angle of exactly $\alpha \pi$ at the center, we create a $2$-sized bag with aggregate label $1$ (we call it a $1$-bag) containing those points. Similarly, bags with aggregate label $0$ (which we call $0$-bags) are formed by pairs of points at an angle of $(1-\alpha)\pi$.
By mapping a $1$-bag to the mid-point of the smaller arc between the two points in the bag (its end-points), and noting that distinct $1$-bags have distinct mid-points, we obtain that the measure of the set of $1$-bags is the same as that of $\mathbb{S}^1$. The same holds for the set of $0$-bags. To construct a measure, define the following bag-sampling procedure: sample a uniform point on the unit circle and output, with probability $1/2$ each, either the unique $1$-bag or the unique $0$-bag corresponding to it.
In particular, the set of $0$-bags and the set of $1$-bags are of equal measure. Let $\mc{B}_c$ be this infinite (continuous) collection of $1$-bags and $0$-bags.

\noindent
{\bf Existence of Weak Classifier.}
Observe that the constant $0$ classifier given by ${\sf pos}(-1)$ will satisfy all $0$-bags and none of the $1$-bags.

Now, consider a random homogeneous halfspace given by ${\sf pos}(\br^{\sf T}\bx)$ for $\br$ uniformly sampled from $\mathbb{S}^1$. The two points of a $0$-bag will not be separated w.p. $\alpha$ and conditioned on this, with probability $1/2$ both will be assigned $0$, implying that any $0$-bag will be satisfied with probability $\alpha/2$. On the other hand, both the points of a $1$-bag will be assigned $0$ w.p. $(1 - \alpha)/2$ implying that it will be satisfied w.p. $(1 + \alpha)/2$. 
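
As a sketch of where these probabilities come from, let $\theta \in [0, \pi]$ denote the angle subtended by the two points of a bag (the symbol $\theta$ is notation introduced only for this remark). The boundary of the random halfspace is a uniformly random line through the origin, and it separates the two points exactly when it crosses the smaller arc between them. If the points are not separated, then by the symmetry $\br \mapsto -\br$ both are assigned $0$ with probability $1/2$. Ignoring measure-zero events, this gives
\begin{eqnarray*}
    \Pr[\tn{points separated}] & = & \theta/\pi, \\
    \Pr[\tn{both points assigned } 0] & = & (1 - \theta/\pi)/2.
\end{eqnarray*}
Setting $\theta = (1-\alpha)\pi$ for a $0$-bag yields the satisfaction probability $\alpha/2$, while setting $\theta = \alpha\pi$ for a $1$-bag yields $1 - (1-\alpha)/2 = (1+\alpha)/2$.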

Consider any probability measure on $\mc{B}_c$ under which the measure of the $0$-bags is $p$ and that of the $1$-bags is $(1-p)$. If $p \geq 2/3$ then the constant $0$ classifier satisfies all the $0$-bags, yielding an accuracy of $p \geq 2/3$. Otherwise $p < 2/3$, and the expected measure of the bags satisfied by the random homogeneous halfspace is
\begin{eqnarray}
    p\alpha/2 + (1-p)(1 + \alpha)/2 & = & (1 + \alpha)/2 - p/2 \nonumber \\
    & \geq & 1/2 + \alpha/2 - 1/3 \nonumber \\
    & = & 2/3 - (1 -\alpha)/2. \label{eqn:weak-multi}
\end{eqnarray}
In particular, some homogeneous halfspace attains at least this accuracy. Therefore, for any reweighting of the bags, there is always a weak classifier of accuracy at least $2/3 - (1-\alpha)/2$.
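
As a concrete illustration (the value $\alpha = 0.9$ is chosen here only as an example and is not fixed by the construction), the guarantee evaluates to
\[
    2/3 - (1 - 0.9)/2 = 2/3 - 1/20 = 37/60 \approx 0.617.
\]
More generally, $2/3 - (1-\alpha)/2$ exceeds $1/2$ exactly when $\alpha > 2/3$, and it approaches $2/3$ as $\alpha \to 1$.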

\noindent
{\bf No Strong Classifier.}
Consider any $\{0,1\}$-labeling of $\mathbb{S}^1$, where the subset labeled $1$ is measurable.
Let $z \in [0, 1]$ represent the fraction of points on $\mathbb{S}^1$ labeled as 1, with the remaining fraction $1-z$ labeled as 0. Sampling a $0$-bag uniformly at random (u.a.r.) and randomly choosing one of its points yields the uniform distribution over $\mathbb{S}^1$. Thus, the probability that a random $0$-bag  is satisfied is $\leq 1 - z$. 
Each point in $\mathbb{S}^1$ is an element of exactly two distinct $1$-bags, so the probability that in a random $1$-bag at least one of its points is labeled $1$ is at most $\min\{2z, 1\}$.
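
Both bounds can be sketched explicitly. Write $f : \mathbb{S}^1 \to \{0,1\}$ for the labeling and $\{\bx, \bx'\}$ for a random bag (the symbols $f$ and $\bx'$ are notation introduced only for this remark). Under the bag-sampling procedure, the mid-point of the bag is uniform on $\mathbb{S}^1$, and hence so is each of its end-points. A $0$-bag is satisfied only if both of its points are labeled $0$, and a $1$-bag only if at least one of its points is labeled $1$, so that
\begin{eqnarray*}
    \Pr[f(\bx) = 0 \tn{ and } f(\bx') = 0] & \leq & \Pr[f(\bx) = 0] \ = \ 1 - z, \\
    \Pr[f(\bx) = 1 \tn{ or } f(\bx') = 1] & \leq & \Pr[f(\bx) = 1] + \Pr[f(\bx') = 1] \ = \ 2z,
\end{eqnarray*}
where the first line refers to a random $0$-bag, the second to a random $1$-bag, and the second line is a union bound.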

Therefore, the probability that a random bag from $\mc{B}_c$ is satisfied by the labeling is at most
\begin{equation}
    \frac{1-z + \min\{2z,1\}}{2} = \begin{cases} 1 - z/2 & \tn{ if } z \geq 1/2 \\
                                                 1/2 + z/2 & \tn{ otherwise}
                                    \end{cases}
\end{equation}
which attains its maximum of $3/4$ at $z = 1/2$, since $1/2 + z/2$ is increasing and $1 - z/2$ is decreasing in $z$. Thus, no classifier can have accuracy $> 3/4$ on $\mc{B}_c$.


\noindent
{\bf Discretization.} Let $T$ be a large positive integer, and divide $\mathbb{S}^1$ into $2T$ contiguous, non-overlapping arcs $\{A_i\}_{i=1}^{2T}$ (which we also call segments) of length $\delta \pi$ each, where $\delta = 1/T$. We choose $T$ large enough so that $2\delta < \min\{(2\alpha -1), (1 - \alpha)\}$, ensuring that: \\
(i) there is no segment that contains both endpoints of any bag in $\mc{B}_c$, and \\
(ii) for any pair of segments $A_i$ and $A_j$, if there is a $0$-bag in $\mc{B}_c$ with one point in $A_i$ and another in $A_j$, then there is no such $1$-bag, and similarly if there is a $1$-bag in $\mc{B}_c$ with one point in $A_i$ and another in $A_j$, then there is no such $0$-bag.
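
A brief sketch of why the condition on $\delta$ suffices, writing $\angle(\bx, \bx') \in [0, \pi]$ for the angle subtended by two points (notation introduced only for this remark): two points in the same segment subtend an angle of at most $\delta\pi$, whereas the end-points of any bag subtend an angle of at least $(1-\alpha)\pi > 2\delta\pi$, which gives (i). For (ii), suppose $\bx, \by \in A_i$ and $\bx', \by' \in A_j$. By the triangle inequality for angles on the circle,
\[
    \left| \angle(\bx, \bx') - \angle(\by, \by') \right| \ \leq \ \angle(\bx, \by) + \angle(\bx', \by') \ \leq \ 2\delta\pi.
\]
If $\{\bx, \bx'\}$ were a $0$-bag and $\{\by, \by'\}$ a $1$-bag, the left-hand side would equal $\alpha\pi - (1-\alpha)\pi = (2\alpha - 1)\pi > 2\delta\pi$, a contradiction.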

Using property (ii) above, we construct a discrete set of bags $\mc{B}_d$ as follows. If a pair of segments $A_i$ and $A_j$ is such that there is a $0$-bag in $\mc{B}_c$ with one point in $A_i$ and another in $A_j$, then add $\{A_i, A_j\}$ as a $0$-bag whose weight is the measure of all the bags in $\mc{B}_c$ (which are necessarily $0$-bags) with one point in $A_i$ and another in $A_j$. Analogously, add pairs of segments as $1$-bags.
Note that by property (i), all bags in $\mc{B}_d$ have size $2$.

{\it No Strong Classifier.} Let us first consider any $\{0,1\}$-labeling of $\{A_i\}_{i=1}^{2T}$. This directly corresponds to a $\{0,1\}$-labeling of $\mathbb{S}^1$ by assigning each point the label of the segment containing it. Further, by construction, the weight of the bags in $\mc{B}_d$ satisfied by the labeling of the segments equals the measure of the bags in $\mc{B}_c$ satisfied by the corresponding labeling of $\mathbb{S}^1$ which, as shown above, is at most $3/4$.

In particular, the above argument also shows that the measure of bags in $\mc{B}_d$ satisfied by the constant $0$ labeling to $\{A_i\}_{i=1}^{2T}$ is the same as that in $\mc{B}_c$ satisfied by the constant $0$ labeling to  $\mathbb{S}^1$.

{\it Weak Classifier.} Lastly, we translate the labeling by a homogeneous halfspace on $\mathbb{S}^1$ to a labeling of $\{A_i\}_{i=1}^{2T}$ by assigning each $A_i$ the label of its mid-point. Define the \emph{error} set as the set of points in $\mathbb{S}^1$ whose label given by the homogeneous halfspace differs from the label of the segment containing them. For any homogeneous halfspace, the error set is entirely contained within the two diametrically opposite segments intersected by the boundary of the halfspace. Similarly, the \emph{error} bags in $\mc{B}_c$ are those whose aggregate label given by the homogeneous halfspace differs from the aggregate label of the corresponding bag in $\mc{B}_d$.

The \emph{error} bags in $\mc{B}_c$ are a subset of those which have at least one end-point in the error set of points. Given any bag in $\mc{B}_c$, the probability over a random homogeneous halfspace that it is an error bag is at most the probability that one of its end-points lies in a segment intersected by the boundary of the halfspace. By symmetry, a segment is intersected with probability $1/T$. So the probability that any given bag in $\mc{B}_c$ is an error bag is at most $2/T = 2\delta$.
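
The symmetry step can be sketched as follows. The boundary of the halfspace is the line through the origin perpendicular to $\br$, which meets $\mathbb{S}^1$ in two antipodal points, each uniformly distributed on $\mathbb{S}^1$. A fixed segment of length $\delta\pi$ contains a given one of them with probability $\delta\pi/(2\pi) = \delta/2$, so it is intersected with probability at most $\delta/2 + \delta/2 = \delta = 1/T$. A union bound over the two end-points of a bag then gives
\[
    \Pr[\tn{bag is an error bag}] \ \leq \ \delta + \delta \ = \ 2\delta \ = \ 2/T.
\]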

Thus, from \eqref{eqn:weak-multi} we obtain that for any weighting of the bags in $\mc{B}_d$, there is a classifier of accuracy at least $2/3 - (1-\alpha)/2 - 2\delta$.

\subsection{Completing the proof of Theorem \ref{thm:MIL-imposssibility}.} Fix any sufficiently small $\eps$, say $\eps \in (0, 0.1)$, and set $\alpha = 1 - \eps$ and $T = \lceil 4/\eps\rceil$. Then $\delta \leq \eps/4$, so that $2\delta < \min\{(2\alpha -1), (1 - \alpha)\}$ and $2/3 - (1-\alpha)/2 - 2\delta \geq 2/3 - \eps$.
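
These conditions can be checked directly, as sketched here. Since $T \geq 4/\eps$, we have $\delta = 1/T \leq \eps/4$ and hence $2\delta \leq \eps/2$. With $\alpha = 1 - \eps$ and $\eps < 0.1$,
\begin{eqnarray*}
    1 - \alpha & = & \eps \ > \ \eps/2 \ \geq \ 2\delta, \\
    2\alpha - 1 & = & 1 - 2\eps \ > \ 0.8 \ > \ \eps/2 \ \geq \ 2\delta, \\
    2/3 - (1-\alpha)/2 - 2\delta & \geq & 2/3 - \eps/2 - \eps/2 \ = \ 2/3 - \eps.
\end{eqnarray*}
For instance, with the purely illustrative choice $\eps = 0.05$ we get $\alpha = 0.95$, $T = 80$ and $\delta = 0.0125$, so the weak classifier has accuracy at least $2/3 - 0.025 - 0.025 \approx 0.617$, while no labeling exceeds accuracy $3/4$.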
