Options Discovery with Budgeted Reinforcement Learning
Aurelia Léon, Ludovic Denoyer
Nov 04, 2016 (modified: Jan 19, 2017) | ICLR 2017 conference submission | readers: everyone
Abstract: We consider the problem of learning hierarchical policies for Reinforcement Learning that are able to discover options, where an option corresponds to a sub-policy over a set of primitive actions. Different models have been proposed during the last decade, but they usually rely on a predefined set of options. We specifically address the problem of automatically discovering options in decision processes. We describe a new RL framework called Bi-POMDP and a new learning model called Budgeted Option Neural Network (BONN), which discovers options based on a budgeted learning objective. Since Bi-POMDPs are more general than POMDPs, our model can also be used to discover options for classical RL tasks. The BONN model is evaluated on different classical RL problems, showing interesting results both quantitatively and qualitatively.
TL;DR: The article describes a new learning model called Budgeted Option Neural Network (BONN), which discovers options based on a budgeted learning objective, and a new RL framework called Bi-POMDP.