Inverse Optimal Control with Linearly-Solvable MDPs

ICML 2010
Abstract: We present new algorithms for inverse optimal control (or inverse reinforcement learning, IRL) within the framework of linearly-solvable MDPs (LMDPs). Unlike most prior IRL algorithms, which recover only the expert's control policy, we recover the policy, the value function, and the cost function. This is possible because, in LMDPs, the cost and value functions are uniquely defined given the policy. Despite these special properties, we can handle a wide variety of problems, such as the grid worlds popular in RL and most of the nonlinear problems arising in robotics and control engineering. Direct comparisons to prior IRL algorithms show that our new algorithms provide more information and are orders of magnitude faster. Indeed, our fastest algorithm is the first inverse algorithm that does not require solving the forward problem; instead, it performs unconstrained optimization of a convex and easy-to-compute log-likelihood. Our work also sheds light on the recent Maximum Entropy IRL (MaxEntIRL) algorithm, which was defined in terms of density estimation, leaving the corresponding forward problem unspecified. We show that MaxEntIRL is inverting an LMDP, using the less efficient of the algorithms derived here. Unlike all prior IRL algorithms, which assume pre-existing features, we study feature adaptation and show that such adaptation is essential in continuous state spaces.
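
The following is a minimal sketch, not from the paper, of the idea behind the fastest algorithm described in the abstract: in a discrete-state LMDP with passive dynamics p(x'|x), the optimal policy is proportional to p(x'|x) exp(-v(x')), so the negative log-likelihood of observed expert transitions is convex in the value function v and can be minimized directly, without solving the forward problem. The passive dynamics P, the simulated expert data, and all variable names below are assumptions made for this illustration.

import numpy as np
from scipy.special import logsumexp
from scipy.optimize import minimize

# Hypothetical passive dynamics P (row-stochastic) and expert data; the true
# value function v_true is used only to simulate the "expert" for this sketch.
rng = np.random.default_rng(0)
n = 8                                        # number of states (toy example)
P = rng.dirichlet(np.ones(n), size=n)        # passive dynamics p(x'|x)
v_true = rng.normal(size=n)                  # hidden value function

def policy_logprobs(v, P):
    # Optimal LMDP policy: u*(x'|x) proportional to p(x'|x) * exp(-v(x')).
    log_u = np.log(P) - v[None, :]
    return log_u - logsumexp(log_u, axis=1, keepdims=True)

# Simulate expert transitions from the optimal policy under v_true.
log_u_true = policy_logprobs(v_true, P)
states = rng.integers(0, n, size=500)
nexts = np.array([rng.choice(n, p=np.exp(log_u_true[s])) for s in states])

def neg_log_likelihood(v):
    # Convex in v: a linear term plus one log-sum-exp per observed state.
    log_u = policy_logprobs(v, P)
    return -log_u[states, nexts].sum()

# Unconstrained convex minimization; v is identifiable only up to an additive
# constant, so both estimates are centered before comparison.
res = minimize(neg_log_likelihood, np.zeros(n), method="BFGS")
v_hat = res.x - res.x.mean()
print("recovered v (centered):", np.round(v_hat, 2))
print("true v (centered):     ", np.round(v_true - v_true.mean(), 2))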