- Abstract: Learning to control a robot by directly applying model-free Reinforcement Learning (RL) is prone to failure due to extreme sample inefficiency. We propose to address this issue with several techniques that improve sample complexity. In simulation we employ reward shaping, multi-task learning, and apprenticeship learning. To transfer the learned policy to the real robot, we use domain randomization techniques to improve the policy's robustness. In subsequent phases we plan to use learned domain randomization to target performance on the real system rather than robustness alone.
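The domain randomization step can be illustrated with a minimal sketch: before each simulated episode, physics parameters are resampled so the policy cannot overfit to one simulator configuration. The parameter names (`mass`, `friction`, `motor_gain`) and their ranges below are illustrative assumptions, not values from this work.

```python
import random

def sample_dynamics(rng):
    """Draw randomized physics parameters for one simulated episode.

    Parameter names and ranges are illustrative assumptions; in practice
    they would match the tunable quantities of the chosen simulator.
    """
    return {
        "mass": rng.uniform(0.8, 1.2),        # +/-20% around nominal link mass
        "friction": rng.uniform(0.5, 1.5),    # wide friction coefficient range
        "motor_gain": rng.uniform(0.9, 1.1),  # small actuator-gain jitter
    }

# Resample dynamics at the start of each training episode.
rng = random.Random(0)
params = [sample_dynamics(rng) for _ in range(3)]
```

Training across many such draws encourages a policy that is robust to the simulator-to-reality gap; the "learned" variant mentioned above would instead adapt these sampling distributions based on real-system feedback.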