GeoDT: Geometry-Aware Decision Transformer for Robust Safe Multi-Task Offline Reinforcement Learning
Abstract: Scaling offline reinforcement learning across heterogeneous tasks remains challenging, especially under safety constraints. In multi-task settings, features processed by a shared model may play different semantic roles across tasks, leading to semantic inconsistency, conflicting optimization signals, and performance degradation as task diversity increases. While prior multi-task and safe offline RL methods address parts of this challenge, few provide a unified framework that is both effective and safety-aware. We propose GeoDT (Geometry-Aware Decision Transformer), a framework for safe multi-task offline RL that biases cross-task sharing toward geometry-related trajectory structure to mitigate semantic inconsistency. GeoDT learns to separate geometry-related structure from task-specific semantics, constructs geometry-aware context from prompt trajectories through relational structure induction and prototype memory, and incorporates safety by using cost signals to shape the feasible region of geometric reuse. The resulting context is fused with task-specific semantic features to condition a cost-aware Decision Transformer. To assess behavior as task diversity increases, we further introduce the Task Scaling Robustness Score (TSRS) and the Inter-Task Balance Score (ITBS), which measure performance retention and cross-task balance, respectively, as the number of tasks grows. Experiments on multi-task safe offline RL benchmarks show that, compared with competitive baselines, GeoDT achieves strong reward-cost trade-offs, improved robustness under increasing task diversity, and zero-shot adaptation to unseen safety budgets. These results suggest that geometry-related trajectory structure can provide an effective basis for safe multi-task offline reinforcement learning.
Submission Type: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Sebastian_Trimpe1
Submission Number: 8663