Keywords: differential privacy, online learning
Abstract: Hu and Mehta [2024] posed an open problem: what is the optimal instance-dependent rate for stochastic decision-theoretic online learning (with $K$ actions and $T$ rounds) under $\varepsilon$-differential privacy?
Previously, the best known upper and lower bounds were $O\left(\frac{\log K}{\Delta_{\min}} + \frac{\log K\log T}{\varepsilon}\right)$ and $\Omega\left(\frac{\log K}{\Delta_{\min}} + \frac{\log K}{\varepsilon}\right)$, respectively, where $\Delta_{\min}$ is the gap between the optimal action and the second-best action.
In this paper, we partially address this open problem with two new results.
First, we provide an improved upper bound of $O\left(\frac{\log K}{\Delta_{\min}} + \frac{\log^2 K}{\varepsilon}\right)$ for this problem, which is $T$-independent and depends on $K$ only polylogarithmically.
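For intuition, a direct arithmetic comparison of the two privacy terms (our illustration from the stated rates, not an additional claim of the paper):
$$\frac{\log^2 K}{\varepsilon} \;\le\; \frac{\log K\,\log T}{\varepsilon} \quad\Longleftrightarrow\quad \log K \le \log T \quad\Longleftrightarrow\quad K \le T,$$
so the new bound is at least as tight whenever the number of actions does not exceed the number of rounds, and unlike the previous bound it never grows with the horizon $T$.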
Second, to further understand the gap, we introduce the deterministic setting, a weaker variant of this open problem in which the loss vector received in each round is deterministic.
In this weaker setting, directly applying the analysis and algorithms from the original setting still incurs an extra logarithmic factor.
We instead conduct a novel analysis that proves matching upper and lower bounds of $\Theta\left(\frac{\log K}{\varepsilon}\right)$.
Submission Number: 15