Abstract: Physics-based reinforcement learning tasks can benefit from simplified physics simulators, as they potentially allow near-optimal policies to be learned in simulation. However, such simulators require the latent factors of the associated objects (e.g., mass, friction coefficient) and other environment-specific factors (e.g., wind speed, air density) to be accurately specified. Because such a complete specification can be impractical, in this paper we instead focus on learning task-specific estimates of latent factors that allow real-world trajectories to be approximated in an ideal simulation environment. Specifically, we propose two new concepts: a) action grouping, the idea that certain types of actions are closely associated with the estimation of certain latent factors; and b) partial grounding, the idea that simulating task-specific dynamics may not need precise estimation of all the latent factors. We demonstrate our approach on a range of physics-based tasks and show that it achieves superior performance relative to baseline methods, using only a limited number of real-world interactions.