Keywords: Contextual Bandits, Self-Supervised Learning, Domain Adaptation, Representation Learning, Nonstationarity
TL;DR: We introduce a self-supervised representation learning method for contextual bandits, enabling rapid adaptation and improved performance under domain shifts.
Abstract: We propose a new self-supervised domain adaptation framework for contextual bandits that addresses both abrupt and gradual environment shifts. Our method pretrains a compact representation on unlabeled data, then integrates it into both classical (e.g., LinUCB, Thompson Sampling) and neural bandit algorithms. Empirically, we show that our approach substantially reduces regret and accelerates adaptation across eight distinct domains, outperforming standard non-adaptive baselines and simpler autoencoder methods in final performance.
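The integration described in the abstract can be illustrated with a minimal sketch: a fixed encoder learned offline from unlabeled contexts, with LinUCB run on the resulting embeddings. This is not the paper's method; PCA here is only a hypothetical stand-in for the self-supervised pretraining stage, and all names and dimensions are illustrative.

```python
# Hypothetical sketch: LinUCB over features from a pretrained encoder.
# PCA is a stand-in for self-supervised pretraining (an assumption, not
# the paper's actual objective); dimensions and names are illustrative.
import numpy as np

rng = np.random.default_rng(0)

# --- Pretraining stand-in: PCA on unlabeled raw contexts ---
unlabeled = rng.normal(size=(500, 10))            # raw 10-d contexts
_, _, vt = np.linalg.svd(unlabeled - unlabeled.mean(0), full_matrices=False)
encoder = vt[:4].T                                # compact 4-d representation

def encode(x):
    return x @ encoder

# --- LinUCB over the learned representation ---
n_arms, d, alpha = 3, 4, 1.0
A = [np.eye(d) for _ in range(n_arms)]            # per-arm Gram matrices
b = [np.zeros(d) for _ in range(n_arms)]          # per-arm reward vectors

def choose(z):
    # Upper-confidence score per arm: theta^T z + alpha * sqrt(z^T A^-1 z)
    scores = []
    for a in range(n_arms):
        theta = np.linalg.solve(A[a], b[a])
        ucb = theta @ z + alpha * np.sqrt(z @ np.linalg.solve(A[a], z))
        scores.append(ucb)
    return int(np.argmax(scores))

def update(a, z, r):
    A[a] += np.outer(z, z)
    b[a] += r * z

# One simulated interaction round
x = rng.normal(size=10)
z = encode(x)
arm = choose(z)
update(arm, z, rng.normal())
```

Because the encoder is low-dimensional and fixed, the bandit's per-arm statistics stay small, which is what enables fast re-estimation after a domain shift.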
Submission Number: 1