Open Peer Review. Open Publishing. Open Access. Open Discussion. Open Directory. Open Recommendations. Open API. Open Source.
TreeQN and ATreeC: Differentiable Tree Planning for Deep Reinforcement Learning
Nov 03, 2017 (modified: Dec 12, 2017)ICLR 2018 Conference Blind Submissionreaders: everyoneShow Bibtex
Abstract:Combining deep model-free reinforcement learning with on-line planning is a
promising approach to building on the successes of deep RL. On-line planning
with look-ahead trees has proven successful in environments where transition
models are known a priori. However, in complex environments where transition
models need to be learned from data, the deficiencies of learned models have
limited their utility for planning. To address these challenges, we propose
TreeQN, a differentiable, recursive, tree-structured model that serves as a
drop-in replacement for any value function network in deep RL with discrete
actions. TreeQN dynamically constructs a tree by recursively applying a
transition model in a learned abstract state space and then aggregating
predicted rewards and state-values using a tree backup to estimate Q-values. We
also propose ATreeC, an actor-critic variant that augments TreeQN with a softmax
layer to form a stochastic policy network. Both approaches are trained
end-to-end, such that the learned model is optimised for its actual use in the
planner. We show that TreeQN and ATreeC outperform n-step DQN and A2C on a
box-pushing task, as well as n-step DQN and value prediction networks (Oh et
al., 2017) on multiple Atari games, with deeper trees often outperforming
shallower ones. We also present a qualitative analysis that sheds light on the
trees learned by TreeQN.
TL;DR:We present TreeQN and ATreeC, new architectures for deep reinforcement learning in discrete-action domains that integrate differentiable on-line tree planning into the action-value function or policy.
Keywords:reinforcement learning, deep learning, planning
Enter your feedback below and we'll get back to you as soon as possible.