Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing

Davin Choo; Yuqi Pan; Tonghan Wang; Milind Tambe; Alastair van Heerden; Cheryl Johnson

Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing

Davin Choo, Yuqi Pan, Tonghan Wang, Milind Tambe, Alastair van Heerden, Cheryl Johnson

Published: 18 Sept 2025, Last Modified: 29 Oct 2025NeurIPS 2025 posterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: sequential decision making, graphs, branching bandits, gittins index, network-based testing, public health

TL;DR: We develop a policy for adaptive exploration on a graph under frontier constraint that is optimal for trees based on Gittins index, and show how we can apply it to network-based testing in public health settings

Abstract: We study a sequential decision-making problem on a $n$-node graph $\mathcal{G}$ where each node has an unknown label from a finite set $\mathbf{\Omega}$, drawn from a joint distribution $\mathcal{P}$ that is Markov with respect to $\mathcal{G}$. At each step, selecting a node reveals its label and yields a label-dependent reward. The goal is to adaptively choose nodes to maximize expected accumulated discounted rewards. We impose a frontier exploration constraint, where actions are limited to neighbors of previously selected nodes, reflecting practical constraints in settings such as contact tracing and robotic exploration. We design a Gittins index-based policy that applies to general graphs and is provably optimal when $\mathcal{G}$ is a forest. Our implementation runs in $\mathcal{O}(n^2 \cdot |\mathbf{\Omega}|^2)$ time while using $\mathcal{O}(n \cdot |\mathbf{\Omega}|^2)$ oracle calls to $\mathcal{P}$ and $\mathcal{O}(n^2 \cdot |\mathbf{\Omega}|)$ space. Experiments on synthetic and real-world graphs show that our method consistently outperforms natural baselines, including in non-tree, budget-limited, and undiscounted settings. For example, in HIV testing simulations on real-world sexual interaction networks, our policy detects nearly all positive cases with only half the population tested, substantially outperforming other baselines.

Primary Area: Optimization (e.g., convex and non-convex, stochastic, robust)

Submission Number: 8387

Loading