Abstract: Recent work in offline reinforcement learning (RL) has demonstrated the effectiveness of formulating decision-making as return-conditioned supervised learning. Notably, the decision transformer (DT) architecture has shown promise across various domains. However, despite this initial success, DTs have underperformed on several challenging datasets in goal-conditioned RL. This limitation stems from the inefficiency of return conditioning for guiding policy learning, particularly on unstructured and suboptimal datasets, which prevents DTs from effectively learning temporal compositionality. Moreover, this problem may be further exacerbated in long-horizon, sparse-reward tasks. To address this challenge, we propose the Predictive Coding for Decision Transformer (PCDT) framework, which leverages generalized future conditioning to enhance DT methods. PCDT extends the DT architecture by conditioning on predictive codings, enabling decision-making based on both past and future factors and thereby improving generalization. Through extensive experiments on eight datasets from the AntMaze and FrankaKitchen environments, our method achieves performance on par with or surpassing existing popular value-based and transformer-based methods in offline goal-conditioned RL. Furthermore, we evaluate our method on a goal-reaching task with a physical robot.
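To make the conditioning scheme described above concrete, the following is a minimal sketch (not the authors' implementation) of a decision-transformer-style policy in which a per-timestep predictive coding replaces the usual return-to-go token. All module names, dimensions, and the way codings are produced (e.g., by an encoder over future states) are assumptions for illustration.

```python
import torch
import torch.nn as nn

class FutureConditionedDT(nn.Module):
    """Sketch of a DT variant conditioned on predictive codings of the future."""

    def __init__(self, state_dim, act_dim, embed_dim=128,
                 n_layers=3, n_heads=4, max_len=20):
        super().__init__()
        self.embed_state = nn.Linear(state_dim, embed_dim)
        self.embed_action = nn.Linear(act_dim, embed_dim)
        # Replaces DT's return-to-go embedding: a latent predictive coding
        # (assumed here to be a precomputed embed_dim vector per timestep).
        self.embed_coding = nn.Linear(embed_dim, embed_dim)
        self.embed_time = nn.Embedding(max_len, embed_dim)
        layer = nn.TransformerEncoderLayer(embed_dim, n_heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, n_layers)
        self.predict_action = nn.Linear(embed_dim, act_dim)

    def forward(self, states, actions, codings, timesteps):
        # states: (B, T, state_dim), actions: (B, T, act_dim),
        # codings: (B, T, embed_dim), timesteps: (B, T) long tensor.
        B, T, _ = states.shape
        t_emb = self.embed_time(timesteps)
        s = self.embed_state(states) + t_emb
        a = self.embed_action(actions) + t_emb
        c = self.embed_coding(codings) + t_emb
        # Interleave (coding, state, action) tokens per timestep, mirroring
        # DT's (return, state, action) token layout.
        tokens = torch.stack([c, s, a], dim=2).reshape(B, 3 * T, -1)
        # Causal mask so each token attends only to earlier tokens.
        mask = nn.Transformer.generate_square_subsequent_mask(3 * T)
        h = self.transformer(tokens, mask=mask)
        # Predict each action from the hidden state at its state token,
        # following the DT convention.
        return self.predict_action(h.reshape(B, T, 3, -1)[:, :, 1])
```

In this sketch the transformer sees past context through the causal attention mask, while the coding tokens carry a learned summary of the future, which is one plausible reading of "decision-making based on both past and future factors" in the abstract.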