Interpreting Decision Transformer: Insights from Continuous Control Tasks

Published: 01 Jan 2024, Last Modified: 06 Aug 2025 · ICONIP (1) 2024 · CC BY-SA 4.0
Abstract: Decision Transformers (DT) have demonstrated impressive performance in offline reinforcement learning, but their complex internal mechanisms remain challenging to interpret. This paper presents a comprehensive analysis of Decision Transformers trained on various MuJoCo environments using a range of interpretability techniques. Our analysis includes: (a) positional encoding (PE) analysis to assess the impact of temporal information on performance in different types of environments, (b) return-to-go (RTG) analysis to examine how reliably the model attains specified target returns, (c) embedding analysis revealing the model's ability to learn detailed representations of the environments, (d) attention visualizations to understand the role of attention in behaviour guidance, and (e) perturbation studies to test robustness to input noise. Our analysis reveals the extent to which positional information is necessary for achieving high performance in different MuJoCo environments and provides insights into the model's adaptability to varying task requirements. In addition, the embedding analysis shows hierarchical abstractions captured by the model across layers. Through these analyses, we provide novel insights into the decision-making process of transformers in continuous control domains. These insights are pivotal for designing more interpretable and reliable decision-making systems and deepen our comprehension of the capabilities and limitations of DT in complex tasks.
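For context on the two components the abstract analyzes, the sketch below shows the standard Decision Transformer input construction (Chen et al., 2021): return-to-go, state, and action tokens are each projected into the model dimension, summed with a learned timestep (positional) embedding, and interleaved into one sequence. This is a minimal PyTorch-style illustration, not the paper's code; names such as `DTInputEncoder` and `embed_rtg` are hypothetical, and the toy dimensions are assumptions.

```python
# Minimal sketch of the standard Decision Transformer input tokenization.
# Helper names are illustrative, not taken from the paper's implementation.
import torch
import torch.nn as nn

class DTInputEncoder(nn.Module):
    def __init__(self, state_dim, act_dim, hidden_dim, max_ep_len):
        super().__init__()
        self.embed_rtg = nn.Linear(1, hidden_dim)          # return-to-go token
        self.embed_state = nn.Linear(state_dim, hidden_dim)
        self.embed_action = nn.Linear(act_dim, hidden_dim)
        # Learned positional (timestep) embedding: the component whose
        # necessity the paper's PE analysis probes.
        self.embed_timestep = nn.Embedding(max_ep_len, hidden_dim)

    def forward(self, rtg, states, actions, timesteps):
        # rtg: (B, T, 1), states: (B, T, state_dim),
        # actions: (B, T, act_dim), timesteps: (B, T) integer indices.
        B, T = timesteps.shape
        time_emb = self.embed_timestep(timesteps)          # (B, T, H)
        r = self.embed_rtg(rtg) + time_emb
        s = self.embed_state(states) + time_emb
        a = self.embed_action(actions) + time_emb
        # Interleave as (r_1, s_1, a_1, r_2, s_2, a_2, ...): each timestep
        # contributes three consecutive tokens to the transformer sequence.
        return torch.stack((r, s, a), dim=2).reshape(B, 3 * T, -1)

# Usage with toy shapes (HalfCheetah-like dimensions, assumed here):
enc = DTInputEncoder(state_dim=17, act_dim=6, hidden_dim=128, max_ep_len=1000)
out = enc(torch.zeros(2, 10, 1), torch.zeros(2, 10, 17),
          torch.zeros(2, 10, 6), torch.zeros(2, 10, dtype=torch.long))
print(out.shape)  # torch.Size([2, 30, 128])
```

Because the RTG token conditions every prediction and the timestep embedding is the sole source of temporal order, ablating either (as in the paper's RTG and PE analyses) isolates its contribution to performance.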