Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning

Jun 11, 2018 (edited Jun 13, 2018) | ICML 2018 ECA Submission | Readers: Everyone
  • Keywords: DRL, NLP, Action Elimination
  • TL;DR: Learning control policies from text, when there are thousands of actions to choose from, by learning how not to act
  • Abstract: Learning how to act when there are many available actions in each state is a challenging task for Reinforcement Learning (RL) agents, especially when many of the actions are redundant or irrelevant. In such cases, it is easier to learn which actions not to take. In this work, we propose the Action-Elimination Deep Q-Network (AE-DQN) architecture that combines a Deep RL algorithm with an Action Elimination Network (AEN) that eliminates sub-optimal actions. The AEN is trained to predict invalid actions, supervised by an external elimination signal provided by the environment. Simulations demonstrate a considerable speedup and added robustness over vanilla DQN in text-based games with over a thousand discrete actions.
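
The idea in the abstract, restricting the Q-network's choice to actions the elimination model has not ruled out, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the per-action elimination probabilities, and the fixed threshold are all hypothetical, and the abstract does not state which elimination rule the AEN actually uses.

```python
import random


def admissible_actions(elim_probs, threshold=0.5):
    """Return indices of actions not ruled out by the elimination model.

    elim_probs[a] is an assumed predicted probability that action a is
    invalid in the current state; the threshold rule is an assumption.
    """
    keep = [a for a, p in enumerate(elim_probs) if p < threshold]
    # If every action was eliminated, fall back to the full action set
    # so the agent can still act.
    return keep if keep else list(range(len(elim_probs)))


def select_action(q_values, elim_probs, epsilon=0.0, threshold=0.5, rng=random):
    """Epsilon-greedy selection restricted to non-eliminated actions."""
    if len(q_values) != len(elim_probs):
        raise ValueError("q_values and elim_probs must have the same length")
    candidates = admissible_actions(elim_probs, threshold)
    if rng.random() < epsilon:
        # Explore only among actions that survived elimination.
        return rng.choice(candidates)
    # Exploit: highest Q-value among the surviving actions.
    return max(candidates, key=lambda a: q_values[a])


if __name__ == "__main__":
    q_values = [1.0, 5.0, 2.0, 0.5]
    elim_probs = [0.1, 0.9, 0.2, 0.7]  # actions 1 and 3 are predicted invalid
    print(select_action(q_values, elim_probs))
```

In this toy state, action 1 has the highest Q-value but is masked out, so the greedy choice falls to the best remaining action. Both exploration and exploitation draw from the reduced set, which is one plausible reading of how elimination could speed up learning when most actions are irrelevant.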