Published: 01 Jan 2022, Last Modified: 12 May 2023, UAI 2022, Readers: Everyone
Abstract: In many real-world applications of multi-armed bandit problems, both rewards and contexts are often influenced by confounding latent variables which evolve stochastically over time. While the obser...