Abstract: Deep reinforcement learning (DRL) has achieved remarkable success across a wide range of domains. In many applications, agents must perform long-horizon tasks in partially observable environments. However, despite this high performance, the decision-making process of DRL agents on long-horizon tasks is opaque and difficult to interpret. Although memory-based models and Transformers have been proposed to address this interpretability issue, challenges remain in terms of limited flexibility, instability, and high computational cost. To address these concerns, we propose a scalable, computationally inexpensive DRL model that uses attention-augmented memory (AAM) to make the long-horizon decision-making process interpretable. AAM stores the long short-term memory (LSTM) states of past observations and combines them into a single contextual vector through a soft attention bottleneck. The agent is then trained to make decisions based on this contextual vector together with the current observation. We apply the AAM model to a navigation problem in Labyrinth [1] and generate attention and saliency maps that show which areas of the current observation the agent highlights and which memories it attends to. We also evaluate the model's resilience using saliency and find that the proposed method is more robust to noise in visual observations than the baseline model. In summary, the proposed method demonstrates an interpretable and noise-robust DRL approach for long-horizon tasks.
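The abstract describes the AAM mechanism only at a high level. The following PyTorch sketch illustrates one way a soft attention bottleneck over stored LSTM states could produce a single contextual vector; the class and method names, the dot-product attention form, and the fixed-size ring buffer are our own illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentionAugmentedMemory(nn.Module):
    """Minimal sketch of the AAM idea: store past LSTM hidden states in a
    memory buffer and summarize them into one contextual vector via soft
    attention, queried by the current hidden state. Details (buffer policy,
    attention form) are assumptions, not the authors' exact formulation."""

    def __init__(self, hidden_dim: int, memory_size: int):
        super().__init__()
        self.memory_size = memory_size
        self.query_proj = nn.Linear(hidden_dim, hidden_dim)
        self.key_proj = nn.Linear(hidden_dim, hidden_dim)
        self.register_buffer("memory", torch.zeros(memory_size, hidden_dim))
        self.register_buffer("ptr", torch.zeros((), dtype=torch.long))

    def write(self, h: torch.Tensor) -> None:
        # Append the current LSTM state to a ring buffer of past states.
        self.memory[self.ptr % self.memory_size] = h.detach()
        self.ptr += 1

    def read(self, h: torch.Tensor) -> torch.Tensor:
        # Soft attention bottleneck: the weights over memory slots sum to 1,
        # so the whole memory is compressed into a single contextual vector.
        q = self.query_proj(h)                 # (hidden_dim,)
        k = self.key_proj(self.memory)         # (memory_size, hidden_dim)
        scores = k @ q / (q.shape[-1] ** 0.5)  # (memory_size,)
        weights = F.softmax(scores, dim=-1)    # attention map over memory
        return weights @ self.memory           # (hidden_dim,) context vector


# Hypothetical usage: the policy conditions on the current observation's
# LSTM state concatenated with the AAM context vector.
aam = AttentionAugmentedMemory(hidden_dim=256, memory_size=64)
h_t = torch.randn(256)                     # stand-in for the current LSTM state
context = aam.read(h_t)                    # single contextual vector from memory
aam.write(h_t)                             # store the state for future steps
policy_input = torch.cat([h_t, context])   # decision input: observation + AAM
```

The attention weights computed in `read` are exactly the per-memory-slot scores one would visualize as the attention maps the abstract mentions.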