Learning Good Interventions in Causal Graphs via Covering

Published: 08 May 2023, Last Modified: 26 Jun 2023, UAI 2023, Readers: Everyone
Keywords: Causal Bandits, Causal Inference, Multi-armed Bandits, Simple Regret
TL;DR: State-of-the-art guarantees for the causal bandit problem
Abstract: We study the causal bandit problem, which entails identifying a near-optimal intervention from a specified set $\mathcal{A}$ of (possibly non-atomic) interventions over a given causal graph. Here, an optimal intervention in $\mathcal{A}$ is one that maximizes the expected value of a designated reward variable in the graph, and we use the standard notion of simple regret to quantify near-optimality. For Bernoulli random variables and causal graphs on $N$ vertices with constant in-degree, prior work has achieved a worst-case simple regret guarantee of $\widetilde{O}(N/\sqrt{T})$. The current work uses the idea of covering interventions (which are not necessarily contained within $\mathcal{A}$) and establishes a simple regret guarantee of $\widetilde{O}(\sqrt{N/T})$. Notably, and in contrast to prior work, our simple regret bound depends only on explicit parameters of the problem instance. We also go beyond prior work and achieve a simple regret guarantee for causal graphs with unobserved variables. Further, we perform experiments that show improvements over baselines in this setting.
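As a rough illustration of the quantities in the abstract, the sketch below computes the simple regret of one chosen intervention and compares the two rates with logarithmic factors and constants dropped. It is not the paper's algorithm. The intervention names and expected rewards are made-up values, the helper names are hypothetical, and $T$ is assumed to be the number of rounds, which the abstract does not define.

```python
import math


def simple_regret(expected_reward, chosen):
    """Gap between the best expected reward in the set and that of `chosen`.

    This is the gap for one fixed choice; the usual definition also takes
    an expectation over the learner's randomness.
    """
    return max(expected_reward.values()) - expected_reward[chosen]


def prior_rate(n, t):
    """N / sqrt(T): the earlier worst-case rate, log factors dropped."""
    return n / math.sqrt(t)


def covering_rate(n, t):
    """sqrt(N / T): the rate stated in the abstract, log factors dropped."""
    return math.sqrt(n / t)


# Hypothetical expected rewards for three interventions in the set A.
mu = {"do()": 0.50, "do(X1=1)": 0.55, "do(X1=0, X2=1)": 0.70}
regret = simple_regret(mu, "do(X1=1)")

# Assumed problem size: N vertices, T rounds.
n, t = 100, 10_000
ratio = prior_rate(n, t) / covering_rate(n, t)

print(f"simple regret of do(X1=1): {regret:.2f}")
print(f"ratio of the two rates: {ratio:.1f}")
```

Under these assumptions the ratio of the two rates is $\sqrt{N}$ for every $T$, so the gap between the guarantees widens as the graph grows.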
Supplementary Material: pdf
Other Supplementary Material: zip