Keywords: Reinforcement Learning, Hierarchical Reinforcement Learning, Behavior Foundation Models, Humanoid Control
TL;DR: Task Tokens enable task-specific adaptation of behavior foundation models by learning a reinforcement-trained encoder, enhancing control without compromising generalization.
Abstract: Recent advancements in imitation learning for robotic control have led to transformer-based behavior foundation models (BFMs) that enable multi-modal, human-like control for humanoid agents. These models generate solutions when conditioned on high-level goals or prompts, for example, walking to a coordinate when conditioned on the position of the robot's pelvis. While excelling at zero-shot generation of robust behaviors, BFMs often require meticulous prompt engineering for specific tasks, potentially yielding suboptimal results. In this work, we introduce ``Task Tokens'' - a method to effectively tailor BFMs to specific tasks while preserving their flexibility. Our approach integrates naturally within the transformer architecture of BFMs. Task Tokens trains a task-specific encoder (tokenizer) while leaving the original BFM untouched. Our method reduces trainable parameters per task by up to $\times 125$ and converges up to $\times 6$ faster compared to standard baselines. In addition, by keeping the original BFM unchanged, Task Tokens enables the use of the model's pre-existing encoders. This allows incorporating user-defined priors and balancing reward design against prompt engineering.
We demonstrate Task Tokens' efficacy across various tasks, including out-of-distribution scenarios, and show their compatibility with other prompting modalities. Our results suggest that Task Tokens offer a promising approach for adapting BFMs to specific control tasks while retaining their generalization capabilities.
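To make the adaptation scheme described in the abstract concrete, the sketch below illustrates the general idea under stated assumptions: a small trainable encoder maps task-specific observations to a "task token" that is fed to a frozen transformer BFM alongside its existing prompt tokens, and only the encoder is updated (in practice via a reinforcement learning objective). All class names, dimensions, and the placeholder loss are hypothetical and are not the authors' implementation.

```python
# Hypothetical sketch of the Task Tokens idea: a trainable task encoder produces
# one token in the embedding space of a frozen behavior foundation model (BFM).
import torch
import torch.nn as nn


class TaskTokenEncoder(nn.Module):
    """Maps a task observation to a single token in the BFM's embedding space."""

    def __init__(self, obs_dim: int, embed_dim: int, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, embed_dim),
        )

    def forward(self, task_obs: torch.Tensor) -> torch.Tensor:
        # (batch, obs_dim) -> (batch, 1, embed_dim): one task token per sample
        return self.net(task_obs).unsqueeze(1)


# Stand-in for a pre-trained BFM; its parameters stay frozen during adaptation.
frozen_bfm = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2,
)
for p in frozen_bfm.parameters():
    p.requires_grad_(False)

encoder = TaskTokenEncoder(obs_dim=32, embed_dim=64)
optimizer = torch.optim.Adam(encoder.parameters(), lr=3e-4)  # only encoder params train

task_obs = torch.randn(8, 32)           # task-specific observations
prompt_tokens = torch.randn(8, 10, 64)  # tokens from the BFM's pre-existing encoders

task_token = encoder(task_obs)                          # (8, 1, 64)
tokens = torch.cat([task_token, prompt_tokens], dim=1)  # (8, 11, 64)
output = frozen_bfm(tokens)                             # frozen forward pass

# In the paper the gradient would come from an RL objective; this placeholder
# loss only demonstrates that gradients flow solely into the task encoder.
loss = output.pow(2).mean()
loss.backward()
optimizer.step()
```

Because the BFM is untouched, its original prompt encoders remain usable at inference time, which is what allows mixing the learned task token with user-defined prompts.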
Supplementary Material: zip
Primary Area: reinforcement learning
Submission Number: 25607