Merging Decision Transformers: Weight Averaging for Forming Multi-Task Policies

Published: 03 Mar 2023, Last Modified: 29 Apr 2024
RRL 2023 Spotlight
Readers: Everyone
Keywords: Decision Transformers, merging, transfer learning, offline reinforcement learning
TL;DR: We merge decision transformers trained on different environments to create multi-task models without centralized training.
Abstract: Recent work has shown the promise of creating generalist, transformer-based policies for language, vision, and sequential decision-making problems. Creating such models generally requires centralized training objectives, data, and compute. It is of interest whether we can create generalist policies more flexibly by merging together multiple task-specific, individually trained policies. In this work, we take a preliminary step in this direction by merging, or averaging in weight space, subsets of Decision Transformers trained on different MuJoCo locomotion problems, forming multi-task models without centralized training. We also propose that, when merging policies, we can obtain better results if all policies start from a common pre-trained initialization and are co-trained on shared auxiliary tasks during problem-specific finetuning. In general, we believe research in this direction can help democratize and distribute the process by which generally capable agents are formed.
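The core operation the abstract describes is element-wise weight averaging across task-specific models that share an architecture and a common pre-trained initialization. Below is a minimal PyTorch sketch of that operation; the function name `merge_state_dicts` and its interface are illustrative, not taken from the paper's released code.

```python
# Minimal sketch of uniform weight averaging ("merging") of models,
# assuming all policies share the same architecture and parameter names
# (e.g., Decision Transformers finetuned from one common initialization).
from collections import OrderedDict
from typing import List

import torch


def merge_state_dicts(state_dicts: List[OrderedDict]) -> OrderedDict:
    """Average corresponding parameters across models element-wise."""
    merged = OrderedDict()
    for key in state_dicts[0]:
        # Stack the matching tensor from every model and take the mean.
        merged[key] = torch.stack(
            [sd[key].float() for sd in state_dicts]
        ).mean(dim=0)
    return merged
```

The merged parameters can then be loaded into a fresh instance of the shared architecture via `model.load_state_dict(merged)`, yielding a single multi-task policy without any centralized training.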
Track: Technical Paper
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/arxiv:2303.07551/code)