Discriminative apprenticeship learning with both preference and non-preference behavior
Abstract: Expert demonstrations are usually suboptimal, and failed
demonstrations often carry useful guidance. This paper proposes
a Discriminative Apprenticeship Learning algorithm in which the
apprentice is taught with failed attempts in addition to expert
demonstrations, so that it learns to discriminate between
preference and non-preference cases and to act accordingly.
Because robots usually encounter changing environments, the
algorithm takes generalization ability into account: the reward
function is recovered under an evaluation of the generalization
error. The representation error is also analyzed and incorporated
into the algorithm. A theoretical guarantee on the algorithm's
performance is presented. Experiments on a simple car-driving
robot, together with a comparison against a variety of inverse
reinforcement learning methods, indicate that the proposed
method is an effective and promising alternative.