Keywords: Online learning, Hedge algorithm, Regret minimization, Combinatorial games, Directed acyclic graphs, Dilated entropy, Online Mirror Descent
Abstract: In this paper, we study the classical Hedge algorithm in combinatorial settings. In each round, the learner selects a vector $\mathbf{x}_t$ from a set $\mathcal{X} \subseteq \{0,1\}^d$, observes a full loss vector $\mathbf{y}_t \in \mathbb{R}^d$, and incurs a loss $\langle \mathbf{x}_t, \mathbf{y}_t \rangle \in [-1,1]$. This setting captures several important problems, including extensive-form games, resource allocation, $m$-sets, online multitask learning, and shortest-path problems on directed acyclic graphs (DAGs). It is well known that Hedge achieves a regret of $\mathcal{O}\big(\sqrt{T \log |\mathcal{X}|}\big)$ after $T$ rounds of interaction. We ask whether Hedge is optimal across all combinatorial settings. To that end, we show that for any $\mathcal{X} \subseteq \{0,1\}^d$, Hedge is near-optimal—specifically, up to a $\sqrt{\log d}$ factor—by establishing a lower bound of $\Omega\big(\sqrt{T \log |\mathcal{X}| / \log d}\big)$ that holds for any algorithm. We then identify a natural class of combinatorial sets—namely, $m$-sets with $\log d \leq m \leq \sqrt{d}$—for which this lower bound is tight, and for which Hedge is provably suboptimal by a factor of exactly $\sqrt{\log d}$. At the same time, we show that Hedge is optimal for online multitask learning, a generalization of the classical $K$-experts problem. Finally, we leverage the near-optimality of Hedge to establish the existence of a near-optimal regularizer for online shortest-path problems in DAGs—a setting that subsumes a broad range of combinatorial domains. Specifically, we show that the classical Online Mirror Descent (OMD) algorithm, when instantiated with the dilated entropy regularizer, is iterate-equivalent to Hedge, and therefore inherits its near-optimal regret guarantees for DAGs.
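To make the protocol concrete, the following is a minimal Python sketch of Hedge in this combinatorial setting, assuming $\mathcal{X}$ is small enough to enumerate explicitly; the names `hedge`, `actions`, and `loss_vectors` are illustrative and not taken from the paper.

```python
import itertools
import math
import random

def hedge(actions, loss_vectors):
    """Illustrative Hedge over an explicitly enumerated set X ⊆ {0,1}^d.

    actions: list of 0/1 tuples (the combinatorial set X).
    loss_vectors: list of T loss vectors y_t with <x, y_t> in [-1, 1].
    Returns the total loss incurred by the randomized learner.
    """
    n, T = len(actions), len(loss_vectors)
    eta = math.sqrt(math.log(n) / T)  # standard Hedge tuning, constants omitted
    cum = [0.0] * n                   # cumulative loss of each action in X
    total = 0.0
    for y in loss_vectors:
        # Play an action with probability proportional to exp(-eta * cum loss).
        w = [math.exp(-eta * c) for c in cum]
        x = random.choices(actions, weights=w)[0]
        total += sum(xi * yi for xi, yi in zip(x, y))
        # Full-information feedback: update every action's cumulative loss.
        for i, a in enumerate(actions):
            cum[i] += sum(ai * yi for ai, yi in zip(a, y))
    return total

# Example: m-sets, i.e. all 0/1 vectors with exactly m ones; scaling each
# coordinate of y_t by 1/m keeps <x_t, y_t> in [-1, 1].
d, m, T = 6, 2, 500
X = [tuple(int(i in s) for i in range(d))
     for s in itertools.combinations(range(d), m)]
ys = [[random.uniform(-1.0, 1.0) / m for _ in range(d)] for _ in range(T)]
print(hedge(X, ys))
```

Note that this sketch maintains one weight per element of $\mathcal{X}$, which is exponential in $d$ for sets such as DAG paths; the abstract's iterate-equivalence between Hedge and OMD with the dilated entropy regularizer is what allows the same iterates to be computed without enumerating $\mathcal{X}$.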
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 18851