Visual Explanation using Attention Mechanism in Actor-Critic-based Deep Reinforcement Learning

28 Sept 2020 (modified: 05 May 2023) · ICLR 2021 Conference Blind Submission
Abstract: Deep reinforcement learning (DRL) has great potential for acquiring the optimal action in complex environments such as games and robot control. However, it is difficult to analyze the agent's decision-making, i.e., the reasons it selects the actions acquired through learning. In this work, we propose Mask-Attention A3C (Mask A3C), which introduces an attention mechanism into Asynchronous Advantage Actor-Critic (A3C), an actor-critic-based DRL method, and enables analysis of the agent's decision-making. A3C consists of a feature extractor that extracts features from an image, a policy branch that outputs the policy, and a value branch that outputs the state value. Our method focuses on the policy and value branches and introduces an attention mechanism into each. In this mechanism, mask processing is applied to the feature maps of each branch using mask-attention, which expresses the judgment reasons for the policy and the state value as heat maps. We visualized mask-attention maps for games on the Atari 2600 and found that the reasons behind an agent's decision-making could be easily analyzed across various game tasks. Furthermore, experimental results showed that introducing the attention mechanism also improved the agent's performance.
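
The following is a minimal sketch, not taken from the paper, of how per-branch mask-attention could gate the policy and value feature maps in an A3C-style network. The layer sizes, module names, and the 1x1-convolution + sigmoid mask heads are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskA3CNet(nn.Module):
    """Sketch of an A3C-style network with mask-attention on the policy and value branches.
    All architectural details here are assumptions, not the authors' exact design."""

    def __init__(self, in_channels=4, num_actions=6):
        super().__init__()
        # Shared convolutional feature extractor (Atari-style 4x84x84 input assumed).
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        # One mask-attention head per branch: a 1x1 conv + sigmoid producing a
        # spatial map in [0, 1] that gates that branch's feature map.
        self.policy_mask = nn.Conv2d(64, 1, kernel_size=1)
        self.value_mask = nn.Conv2d(64, 1, kernel_size=1)
        # Branch heads operating on the masked features (64 x 7 x 7 for 84x84 input).
        self.policy_fc = nn.Linear(64 * 7 * 7, 256)
        self.value_fc = nn.Linear(64 * 7 * 7, 256)
        self.policy_out = nn.Linear(256, num_actions)
        self.value_out = nn.Linear(256, 1)

    def forward(self, x):
        f = self.features(x)                          # (B, 64, 7, 7)
        pi_att = torch.sigmoid(self.policy_mask(f))   # policy mask-attention map
        v_att = torch.sigmoid(self.value_mask(f))     # value mask-attention map
        # Mask processing: element-wise gating of each branch's feature map.
        f_pi = (f * pi_att).flatten(1)
        f_v = (f * v_att).flatten(1)
        logits = self.policy_out(F.relu(self.policy_fc(f_pi)))
        value = self.value_out(F.relu(self.value_fc(f_v)))
        # The attention maps are returned so they can be upsampled to the input
        # resolution and rendered as heat maps over the game frame.
        return logits, value, pi_att, v_att
```

Using separate mask heads for the two branches reflects the abstract's point that the judgment reason for the policy can differ from that for the state value; upsampling pi_att or v_att to the input resolution would give the heat-map visualization described above.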
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=-FYEdfq2L