Beyond the Known: Decision Making with Counterfactual Reasoning Decision Transformer

Minh Hoang Nguyen; Linh Le Pham Van; Thommen Karimpanal George; Sunil Gupta; Hung Le

26 Sept 2024 (modified: 05 Feb 2025). Submitted to ICLR 2025. License: CC BY 4.0

Keywords: reinforcement learning, decision transformer, causality

Abstract: Decision Transformer (DT) plays a crucial role in modern reinforcement learning, leveraging offline datasets to achieve impressive results across various domains. However, DT requires high-quality, comprehensive data to perform optimally. In real-world applications, such data is often unavailable, and the underrepresentation of optimal behaviours poses a significant challenge: training on suboptimal offline data can hinder performance. To address this, we propose the Counterfactual Reasoning Decision Transformer (CRDT), a novel framework inspired by counterfactual reasoning. CRDT enhances DT's ability to reason beyond known data by generating and utilizing counterfactual experiences, enabling improved decision-making in out-of-distribution scenarios. Extensive experiments across continuous and discrete action spaces, including environments with limited data, demonstrate that CRDT consistently outperforms conventional DT approaches. Additionally, counterfactual reasoning gives the DT agent stitching ability, allowing it to combine suboptimal trajectories. These results highlight the potential of counterfactual reasoning to enhance RL agents' performance and generalization capabilities.

Primary Area: reinforcement learning

Submission Number: 6788
