Track Selection: Short paper track.
Keywords: behaviour foundation models; transfer learning and generalization; world models
TL;DR: We leverage Behaviour Foundation Models to adapt to unseen dynamics functions.
Abstract: Reinforcement learning agents perform poorly when faced with unseen dynamics. Recent work on Behaviour Foundation Models (BFMs) has produced agents capable of solving many unseen \textit{tasks} in an environment, assuming the dynamics described by the pre-training dataset are consistent with those of the testing environment. In this preliminary work, we relax this assumption and ask: can BFMs return performant policies for tasks in environments whose dynamics differ from those seen during training? We build on work that compensates for differences in dynamics by modifying the reward function the agent is trained against. We show that if the BFM's policy is prompted correctly, we can elicit the behaviour required to solve a specific set of dynamics generalisation problems. We report preliminary experiments on the ExORL benchmark and discuss next steps.
Submission Number: 11