Multi-task Representation Learning for Fixed Budget Pure-Exploration in Linear and Bilinear Bandits

Published: 09 May 2025 · Last Modified: 28 May 2025 · RLC 2025 · CC BY 4.0
Keywords: Multi-task learning, Pure exploration, Linear Bandits, Bilinear Bandits
Abstract: In this paper, we study fixed-budget pure-exploration settings for multi-task representation learning (MTRL) in linear and bilinear bandits. In the fixed-budget MTRL linear bandit setting, the goal is to identify the optimal arm of each task with high probability within a pre-specified budget. Similarly, in the fixed-budget MTRL bilinear setting, the goal is to identify the optimal left and right arms of each task with high probability within the budget. In both MTRL settings, the tasks share a common low-dimensional linear representation, and the goal is to leverage this underlying structure to expedite learning and identify the optimal arm(s) of each task with high precision. We prove the first lower bound for the fixed-budget linear MTRL setting that accounts for the shared structure across tasks. Motivated by this lower bound, we propose the algorithm FB-DOE, which uses a double experimental design approach to allocate samples optimally to the arms across tasks, thereby first learning the shared common representation and then identifying the optimal arm(s) of each task. This is the first study of fixed-budget pure exploration for MTRL in linear and bilinear bandits. Our results show that learning the shared representation, jointly with allocating actions across tasks via a double experimental design, achieves a smaller probability of error than solving the tasks independently.
Submission Number: 184
