\begin{proof}
We construct two two-armed bandit instances, the second obtained from the first by swapping the arms. The key observation is that, when there is no mediator, the distributions of the observed outputs of the two arms are identical, while the actual means of the arms differ, with \( \mu_1 - \mu_2 = \Delta \).

First, suppose we have constructed two arm distributions \( P_1 \) and \( P_2 \) that induce the same observed distribution \( P \) when there is no mediator. In instance 1, arms 1 and 2 have distributions \( P_1 \) and \( P_2 \), respectively; in instance 2, arms 1 and 2 have distributions \( P_2 \) and \( P_1 \), respectively.

Now, consider the scenario without a mediator. Building on the ideas from previous proofs, we obtain:

\[
\mathbb{E}_2[T_2(T)] \leq \mathbb{E}_1[T_2(T)] + T d_{TV}(P, P) = \mathbb{E}_1[T_2(T)].
\]

By symmetry, we similarly have:

\[
\mathbb{E}_1[T_2(T)] \leq \mathbb{E}_2[T_2(T)] + T d_{TV}(P, P) = \mathbb{E}_2[T_2(T)].
\]

Therefore, we conclude that:

\[
\mathbb{E}_1[T_2(T)] = \mathbb{E}_2[T_2(T)].
\]

Applying the same argument to arm 1 (or using \( T_1(T) + T_2(T) = T \)), it follows that:

\[
\mathbb{E}_1[T_1(T)] = \mathbb{E}_2[T_1(T)].
\]

Thus, the expected number of pulls of each arm is the same in both instances.


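In terms of the actual means, however, arm 1 is optimal in instance 1 and arm 2 is optimal in instance 2, each with gap \( \Delta \). Writing \( R_T(j) \) for the regret in instance \( j \) and using \( T_1(T) + T_2(T) = T \), we have:

\[
\mathbb{E}[R_T(1)] +  \mathbb{E}[R_T(2)] = \mathbb{E}_1[T_2(T)]\Delta + \mathbb{E}_2[T_1(T)]\Delta  = \mathbb{E}_2[T_2(T)]\Delta  + \mathbb{E}_2[T_1(T)]\Delta  = T\Delta = \Omega(T).
\]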

Therefore, since the larger of the two terms is at least their average, there exists \( i \in \{1, 2\} \) such that:
\[
\mathbb{E}[R_T(i)]  \geq \frac{T\Delta}{2} = \Omega(T).
\]

We now construct arms with the stated property. Assume that the rewards of each arm follow a discrete distribution, and let \( f_Y \) denote the probability mass function of \( Y \). Then:

\[
f_Y(y \mid a, O^Y = 1) = \sum\limits_{m} \mathbb{P}(m \mid a, O^Y = 1) f_Y(y \mid a, m, O^Y = 1) = \sum\limits_{m} \mathbb{P}(m \mid a, O^Y = 1) f_Y(y \mid a, m),
\]

and

\[
\mathbb{P}(m \mid a, O^Y = 1) = \frac{\mathbb{P}(m, a, O^Y = 1)}{\sum\limits_{m'} \mathbb{P}(m', a, O^Y = 1)} = \frac{\gamma_{m, a} p_{m, a}}{\sum\limits_{m'} \gamma_{m', a} p_{m', a}}.
\]

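Now, if we let \( \gamma_{m, a} = \frac{\frac{1}{p_{m, a}}}{\sum\limits_{m'} \frac{1}{p_{m', a}}} \), then the product \( \gamma_{m, a} p_{m, a} \) does not depend on \( m \), so \( \mathbb{P}(m \mid a, O^Y = 1) = \frac{1}{K} \), where \( K = |\mathbb{M}| \) is the number of mediator values. Hence:

\[
f_Y(y \mid a, O^Y = 1) = \frac{\sum\limits_{m} f_Y(y \mid a, m)}{K}.
\]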

This expression does not depend on \( p_{m, a} \). Additionally, we have:

\[
\mathbb{P}(O^Y = 1 \mid a) = \sum\limits_{m} p_{m, a} \gamma_{m, a} = \frac{K}{\sum\limits_{m} \frac{1}{p_{m, a}}}.
\]
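
As an illustrative check of this choice (the numbers below are assumptions picked for the example, not part of the construction), take \( K = 2 \) and \( (p_{1, a}, p_{2, a}) = (0.9, 0.1) \). Then \( \sum_{m} \frac{1}{p_{m, a}} = \frac{10}{9} + 10 = \frac{100}{9} \), so

\[
\gamma_{1, a} = \frac{10/9}{100/9} = 0.1, \qquad \gamma_{2, a} = \frac{10}{100/9} = 0.9, \qquad \gamma_{1, a} p_{1, a} = \gamma_{2, a} p_{2, a} = 0.09.
\]

Hence \( \mathbb{P}(m \mid a, O^Y = 1) = 0.09 / 0.18 = 1/2 \) for both values of \( m \), and \( \mathbb{P}(O^Y = 1 \mid a) = 0.18 = 2 / (100/9) \). Swapping the two probabilities to \( (0.1, 0.9) \) swaps \( \gamma_{1, a} \) and \( \gamma_{2, a} \) but leaves both quantities unchanged.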

Consequently, if the two arms share the same set of probabilities \( P_a = \{ p_{m, a} \mid m \in \mathbb{M} \} \) and the same conditional distributions \( f_Y(y \mid a, m) \), then their observed distributions in the absence of a mediator are identical, regardless of how the probabilities in \( P_a \) are assigned to the values of \( m \). It remains to choose the \( p_{m, a} \) such that \( \mu_1 - \mu_2 = \Delta \).


Since \( \mu_a = \sum\limits_{m} p_{m, a} \mu_{m, a} \), we let all \( \mu_{m, a} \) be zero except for one value of \( m \), for which we set \( \mu_{m, a} = 1 \). For arm \( a = 1 \), let \( p_{m, a} = 1 - \epsilon \) for the \( m \) such that \( \mu_{m, a} = 1 \), and set the remaining probabilities equal to \( \frac{\epsilon}{K - 1} \). For arm \( a = 2 \), let \( p_{m, a} = 1 - \epsilon \) for a single \( m \) such that \( \mu_{m, a} = 0 \), and set the remaining probabilities equal to \( \frac{\epsilon}{K - 1} \).

In this way, \( \mu_1 = 1 - \epsilon \) and \( \mu_2 = \frac{\epsilon}{K - 1} \), so \( \mu_1 - \mu_2 = 1 - \epsilon - \frac{\epsilon}{K - 1} = \Delta \), which completes the construction for sufficiently small \( \epsilon \).
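
For a concrete instance (an illustrative sketch with assumed values \( K = 3 \), \( \epsilon = 0.1 \), and \( \mu_{m, a} = 1 \) only for \( m = 1 \)), the two arms use

\[
(p_{1, 1}, p_{2, 1}, p_{3, 1}) = (0.9, 0.05, 0.05), \qquad (p_{1, 2}, p_{2, 2}, p_{3, 2}) = (0.05, 0.9, 0.05),
\]

so that \( \mu_1 = 0.9 \), \( \mu_2 = 0.05 \), and \( \Delta = 0.85 \). Both arms have the same set of probabilities, hence

\[
\mathbb{P}(O^Y = 1 \mid a) = \frac{3}{\frac{1}{0.9} + \frac{1}{0.05} + \frac{1}{0.05}} = \frac{27}{370} \approx 0.073
\]

for \( a \in \{1, 2\} \), and the observed mean reward \( \frac{1}{K} \sum\limits_{m} \mu_{m, a} = \frac{1}{3} \) is also the same for both arms, even though the actual means differ by \( 0.85 \).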


\end{proof}