Clusters Agnostic Network Lasso Bandits

Sofien Dhouib; Steven Bilaj; Behzad Nourani-Koliji; Setareh Maghsudi

Clusters Agnostic Network Lasso Bandits

Sofien Dhouib, Steven Bilaj, Behzad Nourani-Koliji, Setareh Maghsudi

26 Sept 2024 (modified: 05 Feb 2025)Submitted to ICLR 2025EveryoneRevisionsBibTeXCC BY 4.0

Keywords: multitask learning, contextual bandits, graph total variation, network lasso, clustering

TL;DR: We solve and theoretically analyze a multi-task contextual bandit problem given a task relation graph using a network lasso objective..

Abstract: We consider a multi-task contextual bandit setting, where the learner is given a graph encoding relations between the bandit tasks. The tasks' preference vectors are assumed to be piecewise constant over the graph, forming clusters. At every round, we estimate the preference vectors by solving an online network lasso problem with a suitably chosen, time-dependent regularization parameter. We establish a novel oracle inequality relying on a convenient restricted eigenvalue assumption. Our theoretical findings highlight the importance of dense intra-cluster connections and sparse inter-cluster ones. That results in a sublinear regret bound significantly lower than its counterpart in the independent task learning setting. Finally, we support our theoretical findings by experimental evaluation against graph bandit multi-task learning and online clustering of bandits algorithms.

Supplementary Material: zip

Primary Area: reinforcement learning

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 7922

Loading