Abstract: Reinforcement learning has been widely used in many applications (e.g., self-driving cars, games, and robots), but it does not scale well to large state and action spaces. To address this drawback, researchers have combined it with deep learning, yielding deep reinforcement learning (DRL). Unfortunately, deep learning models have recently been shown to be vulnerable at test time to adversarial attacks, and since DRL builds on deep learning, it inherits the same vulnerability. Indeed, several existing works have crafted adversarial attacks against DRL. Generally, these attacks first decide whether to create an adversarial example for the current state and then determine how much perturbation to add to it. This two-stage design prevents the adversary from exploiting information from previous steps when determining the perturbation. In addition, some attacks fix the adversarial action to the worst possible action, which makes the target agent's behavior look unnatural. Therefore, we propose merging these two problems into a single optimization problem, which allows the adversary to use information from previous steps and to pick an adversarial action that is worth the added perturbation. Finally, we conduct experiments on Atari games to investigate the behavior of an agent attacked by our approach and compare it with state-of-the-art attacks in terms of the amount of added perturbation and the reward. The results show that our approach makes the agent appear to play the games as if no attack were taking place, while outperforming previous works.
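To make the high-level idea concrete, the following is a minimal, self-contained sketch (not the paper's actual method or code) of a per-step attack that jointly decides the adversarial target action and the perturbation, rather than using a separate "attack or not" stage. All names (VictimPolicy, joint_attack, eps, gain_threshold) and the history mechanism (an exponential average of past logits) are illustrative assumptions.

```python
# Illustrative sketch only: a joint "which action / how much perturbation" attack step.
import torch
import torch.nn as nn
import torch.nn.functional as F


class VictimPolicy(nn.Module):
    """Stand-in for the target DRL agent's policy network."""
    def __init__(self, obs_dim=8, n_actions=4):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_actions))

    def forward(self, obs):
        return self.net(obs)  # action logits


def joint_attack(policy, obs, adv_hidden, eps=0.05, gain_threshold=0.1):
    """Pick an adversarial action and its perturbation in one optimization step.

    adv_hidden is a running summary of previous steps (here a simple exponential
    average of past logits), so the decision is not made from the current state
    alone. Returns the (possibly unchanged) observation and the updated summary.
    """
    obs = obs.clone().requires_grad_(True)
    logits = policy(obs)

    # Carry information across steps: blend the current logits into the summary.
    adv_hidden = 0.9 * adv_hidden + 0.1 * logits.detach()

    # Candidate adversarial action: least preferred under the history-aware
    # summary (a simple stand-in for a learned adversarial choice).
    target_action = adv_hidden.argmin(dim=-1)

    # Only perturb when the chosen action is "worth" the perturbation, i.e. the
    # victim's preference gap is small enough to flip within the budget eps.
    gap = logits.max(dim=-1).values - logits.gather(-1, target_action.unsqueeze(-1)).squeeze(-1)
    if gap.item() > gain_threshold / eps:
        return obs.detach(), adv_hidden  # skip: attacking this step is too costly

    # Targeted FGSM-style step that pushes the victim toward the chosen action.
    loss = F.cross_entropy(logits, target_action)
    loss.backward()
    adv_obs = (obs - eps * obs.grad.sign()).detach()
    return adv_obs, adv_hidden


if __name__ == "__main__":
    policy = VictimPolicy()
    obs = torch.randn(1, 8)
    hidden = torch.zeros(1, 4)
    adv_obs, hidden = joint_attack(policy, obs, hidden)
    print("perturbation L_inf:", (adv_obs - obs).abs().max().item())
```

In this sketch, the target-action choice, the decision to skip a step, and the perturbation magnitude all fall out of one procedure that conditions on the accumulated history, which is the kind of joint formulation the abstract contrasts with earlier two-stage attacks.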