Abstract: With recent advances in deep learning and increased computational power, there has been growing interest in applying deep learning not only to virtual environments such as games, but also to real-world settings where robots with multiple degrees of freedom operate. However, applying deep learning in real environments remains challenging because of the large number of samples required to learn effective policies for a given task. In this paper, we propose a method for acquiring policies that generalize to environments not encountered during training by synthesizing individual models learned in simulation within a shared parameter space. Specifically, we represent a policy as a combination of multiple base parameters and their associated base weights, and adapt the policy to new environments by modulating these base weights. We evaluated the proposed method on a quadruped robot model in simulation and on a real robot. In simulation, we tested environments with varying floor friction coefficients and confirmed that the method is robust to changes in the base weights. In real-world experiments, we demonstrated that the policy can be adapted to the physical environment with only a small number of additional learning trials.
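The abstract does not give the paper's exact formulation, but the core idea (policy parameters synthesized as a weighted combination of base parameters, with adaptation restricted to the base weights) can be sketched as follows. This is a minimal illustrative assumption: the names (`theta_bases`, `mix`, `adapt_weights`), the linear mixture, and the finite-difference weight adaptation are all stand-ins, not the authors' method.

```python
import numpy as np

# Illustrative sketch: a policy's parameters are a weighted sum of K base
# parameter vectors learned in simulation; adapting to a new environment
# tunes only the K base weights, not the (much larger) base parameters.
rng = np.random.default_rng(0)

K, D = 3, 8                            # number of bases, parameter dimension
theta_bases = rng.normal(size=(K, D))  # base parameters (frozen after sim training)

def mix(weights):
    """Synthesize policy parameters as a weighted sum of the bases."""
    return np.asarray(weights, dtype=float) @ theta_bases  # shape (D,)

def adapt_weights(loss_fn, w0, lr=0.02, steps=500, eps=1e-4):
    """Adapt only the base weights to a new environment via
    finite-difference gradient descent on a task loss."""
    w = np.array(w0, dtype=float)
    for _ in range(steps):
        grad = np.zeros_like(w)
        for i in range(K):
            dw = np.zeros_like(w)
            dw[i] = eps
            grad[i] = (loss_fn(mix(w + dw)) - loss_fn(mix(w - dw))) / (2 * eps)
        w -= lr * grad
    return w

# Toy "new environment": the ideal parameters happen to lie in the span
# of the bases, so adapting the weights alone can recover them.
theta_star = 0.5 * theta_bases[0] + 0.3 * theta_bases[1] + 0.2 * theta_bases[2]
loss = lambda theta: float(np.sum((theta - theta_star) ** 2))

w_adapted = adapt_weights(loss, w0=np.ones(K) / K)
```

Because only `K` weights are optimized rather than the full `D`-dimensional parameter vector, few environment interactions are needed, which matches the abstract's claim that adaptation to the physical environment requires only a small number of additional learning trials.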
External IDs: doi:10.1007/978-981-95-4445-5_25