\section{Preliminaries}
\label{sec:preliminaries}

\subsection{Problem Formulation}
\label{problem_formulation}

Consider an unlabeled graph $G = (V, E)$ comprising $N$ vertices $V = \{v_1, \ldots, v_N\}$ and $M$ edges $E = \{e_1, \ldots, e_M\}$. Each vertex $v_i \in V$ represents an instance with a true label $l_i \in \{+1, -1\}$. The label distribution of each instance is characterized by $\theta_{v_i} = P(l_i = +1) \in [0, 1]$, while the correlation between two adjacent instances is represented by $\omega_{e_k} = P(l_i = l_j) \in [0, 1]$, where $e_k = (v_i, v_j) \in E$. We do not model correlations between vertices that are not connected by an edge. Following the framework of \citet{pmlr-v216-kulkarni23a}, we assume that all workers are equally reliable, meaning that the label a worker provides for any vertex $v_i \in V$ at a given timestamp $t$ (denoted by $y_{{v_{i}}_{t}}$) is drawn from the underlying label distribution: $y_{{v_{i}}_{t}} \sim \mathrm{Bernoulli}(\theta_{v_{i}})$. While real-world crowdsourcing scenarios can be more intricate than drawing worker labels from a Bernoulli distribution, previous studies \citep{chen2013optimistic, li2016crowdsourcing} suggest that this assumption is generally valid for real-world datasets.
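
For intuition about the scale of $\omega_{e_k}$, consider a baseline that the model above does not assume: if the labels of two adjacent vertices $v_i$ and $v_j$ were independent, the agreement probability would be determined by the vertex parameters alone. With hypothetical values $\theta_{v_i} = 0.9$ and $\theta_{v_j} = 0.8$, chosen purely for illustration,
\begin{equation*}
    P(l_i = l_j) = \theta_{v_i}\theta_{v_j} + (1 - \theta_{v_i})(1 - \theta_{v_j}) = 0.72 + 0.02 = 0.74.
\end{equation*}
A value of $\omega_{e_k}$ above this baseline would indicate that the two labels agree more often than independence predicts, and a value below it that they agree less often.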

Given a labeling budget of $T$, where each worker label costs one unit, our goal is to minimize the uncertainty in the estimates of $\theta_{v_i}$ for every vertex $v_i \in V$ and of $\omega_{e_k}$ for every edge $e_k \in E$.

\subsection{Instance Selection: KG and OPTKG}
\label{existing_approximate_policies}

The Knowledge Gradient (KG)~\citep{frazier2008knowledge} and Optimistic Knowledge Gradient (OPTKG)~\citep{chen2013optimistic} frameworks treat instances as independent and identically distributed (i.i.d.) and propose strategies for selecting the instance to label at each timestamp. KG employs a single-step look-ahead approach that greedily selects the next instance with the highest expected reward, as defined in Eq.~(\ref{eq1}):
\begin{align}\label{eq1}
    &v_t = \underset{v}{\mathrm{argmax}} \; R(\mathbf{S^{t}}, v), \quad \text{where}
    \\
    &R(\mathbf{S^{t}}, v) \doteq p_1 \cdot R_{1}(a_{v}^{t}, b_{v}^{t}) + p_2 \cdot R_{2}(a_{v}^{t}, b_{v}^{t}). \nonumber
\end{align}
In contrast, OPTKG selects the next instance based on an optimistic projection of the reward, as shown in Eq.~(\ref{eq2}):
\begin{align}\label{eq2}
    &v_t = \underset{v}{\mathrm{argmax}} \; R^{+}(\mathbf{S^{t}}, v), \quad \text{where}
    \\
    &R^{+}(\mathbf{S^{t}}, v) \doteq \max\left(R_{1}(a_{v}^{t}, b_{v}^{t}), R_{2}(a_{v}^{t}, b_{v}^{t})\right). \nonumber
\end{align}
In both Eq.~(\ref{eq1}) and Eq.~(\ref{eq2}), $a_{v}^{t}$ and $b_{v}^{t}$ denote the counts of positive and negative labels for vertex $v$ at timestamp $t$. The posterior probabilities $p_1 = \frac{a_{v}^{t}}{a_{v}^{t} + b_{v}^{t}}$ and $p_2 = \frac{b_{v}^{t}}{a_{v}^{t} + b_{v}^{t}}$ are the estimated probabilities that the next label obtained for vertex $v$ is $+1$ or $-1$, respectively. The rewards for obtaining the label $+1$ or $-1$ for vertex $v$ are denoted by $R_{1}(a_{v}^{t}, b_{v}^{t})$ and $R_{2}(a_{v}^{t}, b_{v}^{t})$, respectively.
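
To illustrate how the two scores can differ, consider a small worked example with hypothetical numbers that are not taken from any dataset or from the cited works. Suppose a vertex $v$ has $a_{v}^{t} = 3$ and $b_{v}^{t} = 1$, so that $p_1 = 0.75$ and $p_2 = 0.25$, and assume reward values $R_{1}(a_{v}^{t}, b_{v}^{t}) = 0.02$ and $R_{2}(a_{v}^{t}, b_{v}^{t}) = 0.10$. Then
\begin{align*}
    R(\mathbf{S^{t}}, v) &= 0.75 \cdot 0.02 + 0.25 \cdot 0.10 = 0.04, \\
    R^{+}(\mathbf{S^{t}}, v) &= \max(0.02, 0.10) = 0.10.
\end{align*}
In this example, KG discounts the larger reward by the low probability of observing the label $-1$, whereas OPTKG scores the vertex by the larger reward regardless of how likely that label is.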