Quasimetric Decision Transformers: Enhancing Goal-Conditioned Reinforcement Learning with Structured Distance Guidance
Abstract: Recent work has shown that tackling offline reinforcement learning (RL) with a conditional policy produces promising results, and Decision Transformers (DT) in particular have achieved strong performance by framing offline RL as sequence modeling. However, standard DT methods rely on return-to-go (RTG) tokens, which are heuristically defined and often suboptimal for goal-conditioned tasks. In this work, we introduce the Quasimetric Decision Transformer (QuaD), a novel approach that replaces RTG with learned quasimetric distances, providing a more structured and theoretically grounded guidance signal for long-horizon decision-making. We explore two quasimetric formulations, interval quasimetric embeddings (IQE) and metric residual networks (MRN), and integrate them into DTs. Extensive evaluations on the AntMaze benchmark demonstrate that QuaD outperforms standard Decision Transformers, achieving state-of-the-art success rates and improved generalization to unseen goals. Our results suggest that quasimetric guidance is a viable alternative to RTG, opening new directions for learning structured distance representations in offline RL.
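To make the abstract's central notion concrete: a quasimetric is a distance that satisfies d(x, x) = 0 and the triangle inequality but need not be symmetric, which matches goal-reaching tasks where going from x to y can be harder than the reverse. The following is a minimal toy sketch of such a distance; it is purely illustrative and is not the paper's IQE or MRN parametrization.

```python
import numpy as np

def quasi_dist(x: np.ndarray, y: np.ndarray) -> float:
    # Toy quasimetric: sum of one-sided (positive-part) differences.
    # Satisfies d(x, x) = 0 and the triangle inequality (ReLU is
    # subadditive componentwise), but is asymmetric in general.
    return float(np.maximum(y - x, 0.0).sum())

x = np.array([0.0, 0.0])
y = np.array([1.0, 2.0])
z = np.array([3.0, 1.0])

print(quasi_dist(x, y))  # 3.0
print(quasi_dist(y, x))  # 0.0  -- asymmetry: reverse direction is "free"
# Triangle inequality: d(x, z) <= d(x, y) + d(y, z)
print(quasi_dist(x, z) <= quasi_dist(x, y) + quasi_dist(y, z))  # True
```

Learned formulations such as IQE and MRN parametrize families of functions that satisfy these quasimetric axioms by construction, so the learned distance can serve as a structured guidance signal in place of RTG.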
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Florian_Shkurti1
Submission Number: 5187