Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies

Published: 03 Mar 2023, Last Modified: 29 Apr 2024
RRL 2023 Spotlight
Readers: Everyone
Keywords: Decision Transformers, merging, transfer learning, offline reinforcement learning
TL;DR: We merge decision transformers trained on different environments to create multi-task models without centralized training.
Abstract: Recent work has shown the promise of creating generalist, transformer-based policies for language, vision, and sequential decision-making problems. Creating such models generally requires centralized training objectives, data, and compute. It is of interest whether we can create generalist policies more flexibly by merging together multiple task-specific, individually trained policies. In this work, we take a preliminary step in this direction by merging, or averaging in weight space, subsets of Decision Transformers trained on different MuJoCo locomotion problems, forming multi-task models without centralized training. We also propose that, when merging policies, we can obtain better results if all policies start from a common pre-trained initialization and are co-trained on shared auxiliary tasks during problem-specific finetuning. In general, we believe research in this direction can help democratize and distribute the process by which generally capable agents are formed.
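The core operation the abstract describes is element-wise weight averaging across task-specific models that share an architecture and a common pre-trained initialization. Below is a minimal PyTorch sketch of that operation; the function name `merge_state_dicts` and its interface are illustrative, not taken from the paper's released code.

```python
# Minimal sketch of uniform weight averaging ("merging") of models,
# assuming all policies share the same architecture and parameter names
# (e.g., Decision Transformers finetuned from one common initialization).
from collections import OrderedDict
from typing import List

import torch


def merge_state_dicts(state_dicts: List[OrderedDict]) -> OrderedDict:
    """Average corresponding parameters across models element-wise."""
    merged = OrderedDict()
    for key in state_dicts[0]:
        # Stack the matching tensor from every model and take the mean.
        merged[key] = torch.stack(
            [sd[key].float() for sd in state_dicts]
        ).mean(dim=0)
    return merged
```

The merged parameters can then be loaded into a fresh instance of the shared architecture via `model.load_state_dict(merged)`, yielding a single multi-task policy without any centralized training.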
Track: Technical Paper
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2303.07551/code)