Offline Reinforcement Learning with Adaptive Feature Fusion

ICLR 2026 Conference Submission22641 Authors

20 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Reinforcement Learning, Trajectory Stitching
TL;DR: We propose the Q-Augmented Dual-Feature Fusion Decision Transformer (QDFFDT), which adaptively integrates global sequential and local immediate features, achieving state-of-the-art results on D4RL benchmark tasks.
Abstract: Return-conditioned supervised learning (RCSL) algorithms have demonstrated strong generative capabilities in offline reinforcement learning (RL) by learning action distributions based on both the state and the return. However, many existing approaches treat RL as a conditional sequence modeling task, which can lead to an overreliance on suboptimal past experiences, impairing decision-making and reducing the effectiveness of trajectory synthesis. To address these limitations, we propose a novel approach, the Q-Augmented Dual-Feature Fusion Decision Transformer (QDFFDT), which adaptively combines both global sequence features and local immediate features through a learnable fusion mechanism. This model improves generalization across different tasks without the need for extensive hyperparameter tuning. Experimental results on the D4RL benchmark show that QDFFDT outperforms current methods, establishing new state-of-the-art performance and demonstrating the power of adaptive feature fusion.
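The abstract describes a learnable mechanism that adaptively fuses a global sequence feature with a local immediate feature. Below is a minimal sketch (not the authors' code) of one plausible form such a fusion could take: a sigmoid gate that mixes the two feature streams per timestep. The module name, dimensions, and gating form are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class AdaptiveFeatureFusion(nn.Module):
    """Hypothetical gated fusion of global and local features (illustrative only)."""
    def __init__(self, dim: int):
        super().__init__()
        # Gate conditioned on both feature streams
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, global_feat: torch.Tensor, local_feat: torch.Tensor) -> torch.Tensor:
        # global_feat: per-timestep output of a trajectory-level sequence model
        # local_feat:  embedding of the current state (and return) only
        g = self.gate(torch.cat([global_feat, local_feat], dim=-1))
        # Convex, learnable mix of the two feature streams
        return g * global_feat + (1.0 - g) * local_feat

# Usage (hypothetical): fused = AdaptiveFeatureFusion(dim=128)(seq_feat, local_feat)
# The fused features would then condition the action prediction head.
```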
Primary Area: reinforcement learning
Submission Number: 22641