\section{Conclusion}
\label{sec:conclusion}

In this study, we formulate the budget allocation problem as an optimization problem that minimizes the expected uncertainty in instance labeling and correlation estimation. Using a Markov Decision Process (MDP) framework, we decompose the final expected uncertainty into stage-wise rewards, each of which quantifies the change in entropy over all vertices and edges between two timestamps. We use a Random Forest Regression model to estimate the marginal probabilities of the edges that represent instance correlations, and belief propagation to spread labeling information through the graph. We introduce two approximate policies: OPTUENT-EXP, which selects the instance with the highest expected reward, and OPTUENT-OPT, which selects the instance with the highest optimistic reward at each timestamp. Our empirical results show that the proposed approaches accurately estimate correlations between adjacent nodes and substantially reduce labeling costs. These findings highlight the value of uncertainty-guided decision-making under tight budget constraints and suggest that the approach may generalize to large-scale, real-world graph labeling tasks.

\section{Acknowledgements}

Adithya, Mohna, and Qi were supported in part by the National Science Foundation under NSF grant IIS-2007941. Sihong Xie was supported by the Department of Science and Technology of Guangdong Province (Grant No. 2023CX10X079), the National Key R\&D Program of China (Grant No. 2023YFF0725001), the Guangzhou-HKUST(GZ) Joint Funding Program (Grant No. 2023A03J0008), and the Education Bureau of Guangzhou Municipality.