Learning to Modulate pre-trained Models in RL

Thomas Schmied; Markus Hofmarcher; Fabian Paischer; Razvan Pascanu; Sepp Hochreiter

Learning to Modulate pre-trained Models in RL

Thomas Schmied, Markus Hofmarcher, Fabian Paischer, Razvan Pascanu, Sepp Hochreiter

Published: 03 Mar 2023, Last Modified: 20 Jul 2025RRL 2023 OralReaders: Everyone

Keywords: Reinforcement Learning, Transformer, Multi-task learning, Continual learning, NLP

Abstract: Reinforcement Learning (RL) has experienced great success in complex games and simulations. However, RL agents are often highly specialized for a particular task, and it is difficult to adapt a trained agent to a new task. In supervised learning, an established paradigm is multi-task pre-training followed by fine-tuning. A similar trend is emerging in RL, where agents are pre-trained on data collections that comprise a multitude of tasks. Despite these developments, it remains an open challenge how to adapt such pre-trained agents to novel tasks while retaining performance on the pre-training tasks. In this regard, we pre-train an agent on a set of tasks from the Meta-World benchmark suite and adapt it to tasks from Continual-World. We conduct a comprehensive comparison of fine-tuning methods originating from supervised learning in our setup. Our findings show that fine-tuning is feasible, but for existing methods, performance on previously learned tasks often deteriorates. Therefore, we propose a novel approach that avoids forgetting by modulating the information flow of the pre-trained model. Our method outperforms existing fine-tuning approaches, and achieves state-of-the-art performance on the Continual-World benchmark. To facilitate future research in this direction, we collect datasets for all Meta-World tasks and make them publicly available.

Track: Technical Paper

Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/learning-to-modulate-pre-trained-models-in-rl/code)

2 Replies

Loading