Keywords: Online linear regression, Multi-armed bandit
TL;DR: We propose a new analysis of the celebrated forward algorithm in the setting of stochastic online linear regression, and show the benefits of using it in place of ridge regression whenever possible.
Abstract: We consider the problem of online linear regression in the stochastic setting. We derive high probability regret bounds for online $\textit{ridge}$ regression and the $\textit{forward}$ algorithm. This enables us to compare online regression algorithms more accurately and eliminate assumptions of bounded observations and predictions. Our study advocates for the use of the forward algorithm in lieu of ridge due to its enhanced bounds and robustness to the regularization parameter. Moreover, we explain how to integrate it into algorithms involving linear function approximation to remove a boundedness assumption without deteriorating theoretical bounds. We showcase this modification in linear bandit settings where it yields improved regret bounds. Finally, we provide numerical experiments to illustrate our results and support our intuitions.
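For readers unfamiliar with the two predictors compared in the abstract, the following is a minimal sketch based on their standard textbook definitions, which may differ in notation or details from the paper's. Online ridge predicts with $\theta_t = (\lambda I + \sum_{s<t} x_s x_s^\top)^{-1} \sum_{s<t} x_s y_s$, while the forward algorithm (also known as the Vovk-Azoury-Warmuth forecaster) additionally includes the current feature vector $x_t$ in the regularized Gram matrix before predicting. The function names, the toy data model, and the choice of $\lambda$ below are illustrative assumptions, not taken from the submission.

```python
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def matvec(M, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [dot(row, v) for row in M]


def sherman_morrison(A_inv, x):
    """Inverse of (A + x x^T) given A_inv = A^{-1}, for symmetric A."""
    u = matvec(A_inv, x)                      # A^{-1} x
    denom = 1.0 + dot(x, u)                   # 1 + x^T A^{-1} x
    d = len(x)
    return [[A_inv[i][j] - u[i] * u[j] / denom for j in range(d)]
            for i in range(d)]


def run_online(xs, ys, lam=1.0, forward=False):
    """Predictions made at each round before y_t is revealed.

    forward=False: online ridge (Gram matrix uses x_1..x_{t-1}).
    forward=True:  forward algorithm (Gram matrix also uses x_t).
    """
    d = len(xs[0])
    # Inverse of lam * I.
    A_inv = [[(1.0 / lam if i == j else 0.0) for j in range(d)]
             for i in range(d)]
    b = [0.0] * d                             # sum of y_s * x_s over s < t
    preds = []
    for x, y in zip(xs, ys):
        A_inv_next = sherman_morrison(A_inv, x)   # now includes x_t
        G_inv = A_inv_next if forward else A_inv
        theta = matvec(G_inv, b)
        preds.append(dot(theta, x))
        # Reveal y_t and update the sufficient statistics.
        A_inv = A_inv_next
        b = [bi + y * xi for bi, xi in zip(b, x)]
    return preds


if __name__ == "__main__":
    # Hypothetical toy data: y = <theta_star, x> + Gaussian noise.
    rng = random.Random(0)
    theta_star = [1.0, -2.0, 0.5]
    xs = [[rng.uniform(-1, 1) for _ in theta_star] for _ in range(500)]
    ys = [dot(theta_star, x) + rng.gauss(0, 0.1) for x in xs]
    for name, fwd in [("ridge", False), ("forward", True)]:
        preds = run_online(xs, ys, lam=1.0, forward=fwd)
        loss = sum((p - y) ** 2 for p, y in zip(preds, ys))
        print(name, "cumulative squared loss:", loss)
```

The only difference between the two predictors is which inverse Gram matrix is used at prediction time, so switching from ridge to the forward algorithm costs one extra rank-one update per round in this sketch.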
Submission History: Yes
Submission History - Venue And Year: UAI 2021
Submission History - Improvements Made: Previous reviewers found the writing of the paper confusing because of inconsistent notation, and because we provided a back-of-the-envelope sketch of a lower bound to support an optimality claim, which they found hurried. However, reviewers also believed that "the paper will be a valuable addition to the existing literature" after a revision in light of their remarks.
To improve the quality of our paper, we revamped the writing by 1) correcting all inconsistent notation; 2) reorganizing the structure to present a better flow of ideas for the reader, and rewriting several passages that were previously ambiguous; and 3) deriving a rigorous lower bound to eliminate any doubt about the soundness of our results.
Checklist: Yes, we completed the NeurIPS 2021 paper checklist, and have included it in our PDF.
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
Supplementary Material: zip
Thumbnail: No thumbnail