Keywords: Contextual Bandits, Self-Supervised Learning, Domain Adaptation, Representation Learning, Nonstationarity
TL;DR: We introduce a self-supervised representation learning method for contextual bandits, enabling rapid adaptation and improved performance under domain shifts.
Abstract: We propose a new self-supervised domain adaptation framework for contextual bandits that addresses both abrupt and gradual environment shifts. Our method pretrains a compact representation on unlabeled data, then integrates it into both classical (e.g., LinUCB, Thompson Sampling) and neural bandit algorithms. Empirically, we show that our approach substantially reduces regret and accelerates adaptation across eight distinct domains, outperforming standard non-adaptive baselines and simpler autoencoder methods in final performance.
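The integration described in the abstract can be illustrated with a minimal sketch: a fixed encoder learned offline from unlabeled contexts, with LinUCB run on the resulting embeddings. This is not the paper's method; PCA here is only a hypothetical stand-in for the self-supervised pretraining stage, and all names and dimensions are illustrative.

```python
# Hypothetical sketch: LinUCB over features from a pretrained encoder.
# PCA is a stand-in for self-supervised pretraining (an assumption, not
# the paper's actual objective); dimensions and names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# --- Pretraining stand-in: PCA on unlabeled raw contexts ---
unlabeled = rng.normal(size=(500, 10))            # raw 10-d contexts
_, _, vt = np.linalg.svd(unlabeled - unlabeled.mean(0), full_matrices=False)
encoder = vt[:4].T                                # compact 4-d representation

def encode(x):
    return x @ encoder

# --- LinUCB over the learned representation ---
n_arms, d, alpha = 3, 4, 1.0
A = [np.eye(d) for _ in range(n_arms)]            # per-arm Gram matrices
b = [np.zeros(d) for _ in range(n_arms)]          # per-arm reward vectors

def choose(z):
    # Upper-confidence score per arm: theta^T z + alpha * sqrt(z^T A^-1 z)
    scores = []
    for a in range(n_arms):
        theta = np.linalg.solve(A[a], b[a])
        ucb = theta @ z + alpha * np.sqrt(z @ np.linalg.solve(A[a], z))
        scores.append(ucb)
    return int(np.argmax(scores))

def update(a, z, r):
    A[a] += np.outer(z, z)
    b[a] += r * z

# One simulated interaction round
x = rng.normal(size=10)
z = encode(x)
arm = choose(z)
update(arm, z, rng.normal())
```

Because the encoder is low-dimensional and fixed, the bandit's per-arm statistics stay small, which is what enables fast re-estimation after a domain shift.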
Submission Number: 1