TL;DR: We provide the first private online algorithms for minimizing dynamic regret under stochastic, oblivious, and adaptive adversaries.
Abstract: We design differentially private algorithms for the problem of prediction with expert advice under dynamic regret, also known as tracking the best expert. Our work addresses three natural types of adversaries: stochastic with shifting distributions, oblivious, and adaptive. We design algorithms with sub-linear dynamic regret for all three cases. In particular, under a shifting stochastic adversary whose distribution may shift $S$ times, we provide an $\epsilon$-differentially private algorithm whose expected dynamic regret is at most $O\left( \sqrt{S T \log (NT)} + \frac{S \log (NT)}{\epsilon}\right)$, where $T$ and $N$ are the time horizon and number of experts, respectively. For oblivious adversaries, we give a reduction from dynamic regret minimization to static regret minimization, resulting in an upper bound of $O\left(\sqrt{S T \log(NT)} + \frac{S T^{1/3}\log(T/\delta) \log(NT)}{\epsilon^{2/3}}\right)$ on the expected dynamic regret, where $S$ now denotes the allowable number of switches of the best expert. We then establish a fundamental separation between oblivious and adaptive adversaries for the dynamic setting, mirroring the one known for static regret: while our algorithms show that sub-linear regret is achievable against oblivious adversaries even in the high-privacy regime $\epsilon \le \sqrt{S/T}$, we show that any $(\epsilon, \delta)$-differentially private algorithm must suffer linear dynamic regret under adaptive adversaries for $\epsilon \le \sqrt{S/T}$. Finally, complementing this lower bound, we give an $\epsilon$-differentially private algorithm that attains sub-linear dynamic regret under adaptive adversaries whenever $\epsilon \gg \sqrt{S/T}$.
Lay Summary: In many online systems—like stock trading platforms or recommendation engines—decisions must adapt to changing environments. Traditionally, algorithms compare their performance to the best fixed decision in hindsight, but this benchmark can fail in dynamic settings where the "best choice" shifts over time. This paper studies how to design learning algorithms that track the best changing expert while also preserving user privacy. Specifically, we introduce differentially private algorithms that adapt to such changes across three adversarial environments: stochastic (with shifting distributions), oblivious (fixed losses), and adaptive (strategic responses). Our algorithms achieve strong performance—called sublinear dynamic regret—in each setting while ensuring sensitive user data is protected. We also prove sharp limits: for example, under adaptive adversaries, learning becomes impossible if the privacy requirement is too strict. This work is the first to comprehensively tackle the challenge of private, adaptive decision-making in changing environments.
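To make the expert-advice setting concrete, here is a minimal, hypothetical sketch of one standard way to privatize expert selection: perturb each expert's cumulative loss with Laplace noise and follow the noisy leader. The function names and the naive fresh-noise-per-round scheme are illustrative assumptions only, not the algorithms of this paper (which track a *changing* best expert and use far more careful privacy accounting, e.g. over switches of the benchmark).

```python
import math
import random

def laplace(scale, rng):
    """Draw one sample from the Laplace(0, scale) distribution
    via inverse-CDF sampling."""
    u = rng.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def private_follow_the_noisy_leader(loss_rounds, epsilon, seed=0):
    """Illustrative baseline (NOT this paper's algorithm): at each
    round, pick the expert whose noisy cumulative loss is smallest,
    then observe the round's losses and update the running totals.
    `loss_rounds` is a list of per-round loss vectors in [0, 1]^N.
    Noise scale 1/epsilon is a simplification; a full analysis would
    compose the privacy cost across all T rounds."""
    rng = rng_state = random.Random(seed)
    n = len(loss_rounds[0])
    cum = [0.0] * n          # cumulative loss of each expert
    picks = []
    for losses in loss_rounds:
        noisy = [cum[i] + laplace(1.0 / epsilon, rng) for i in range(n)]
        picks.append(min(range(n), key=lambda i: noisy[i]))
        for i in range(n):
            cum[i] += losses[i]
    return picks
```

With a large `epsilon` (little noise) and a clearly best expert, the noisy leader locks onto that expert after a few rounds; shrinking `epsilon` trades accuracy for privacy, which is exactly the tension the regret bounds above quantify.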
Primary Area: Social Aspects->Privacy
Keywords: Differential Privacy, Online Learning
Submission Number: 8315