The Gain from Ordering in Online Learning

Published: 21 Sept 2023, Last Modified: 02 Nov 2023NeurIPS 2023 posterEveryoneRevisionsBibTeX
Keywords: Online Learning, Self-directed Learning, Hardness of Approximation
TL;DR: We study the effect of ordering the examples in fixed-design online learning and present novel approximation algorithms and hardness results.
Abstract: We study fixed-design online learning where the learner is allowed to choose the order of the datapoints in order to minimize their regret (aka self-directed online learning). We focus on the fundamental task of online linear regression: the learner is given a dataset $X$ with $n$ examples in $d$ dimensions and at step $t$ they select a point $x_t \in X$, predict a value $\widetilde y_t$, and suffer loss $(\widetilde y_t - w^\ast \cdot x_t)^2$. The goal is to design algorithms that order the examples and achieve better regret than random- or worst-order online algorithms. For an arbitrary dataset $X$, we show that, under the Exponential Time Hypothesis, no efficient algorithm can approximate the optimal (best-order) regret within a factor of $d^{1/\poly(\log \log d)}$. We then show that, for structured datasets, we can bypass the above hardness result and achieve nearly optimal regret. When the examples of $X$ are drawn i.i.d.\ from the uniform distribution on the sphere, we present an algorithm based on the greedy heuristic of selecting ``easiest'' examples first that achieves a $\log d$-approximation of the optimal regret.
Supplementary Material: pdf
Submission Number: 14696
Loading