Keywords: Cross-domain transfer; Transfer learning; Reinforcement learning
Abstract: Cross-domain reinforcement learning (CDRL) aims to improve the data efficiency of RL by leveraging data samples collected in a source domain to facilitate learning in a similar target domain. Despite its potential, cross-domain transfer in RL faces two fundamental and intertwined challenges: (i) the source and target domains can have distinct representations (in either states or actions), which makes direct transfer infeasible and thereby requires sophisticated inter-domain mappings; (ii) domain similarity in RL is not easily identifiable a priori, and hence CDRL is prone to negative transfer.
In this paper, we propose to jointly tackle these two challenges through the lens of hybrid Q functions. Specifically, we propose $Q$Avatar, which combines the Q functions of the source and target domains via a proper weight decay function. Through this design, we characterize the convergence behavior of $Q$Avatar and thereby show that it achieves robust transfer: it effectively leverages a source-domain Q function for knowledge transfer to the target domain, regardless of the quality of the source-domain model or the degree of domain similarity.
Through extensive experiments, we demonstrate that $Q$Avatar achieves superior transferability across domains on a variety of RL benchmark tasks, including locomotion and robot arm manipulation, even in scenarios with potential negative transfer.
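Since the abstract describes the hybrid Q-function idea only at a high level, the following is a minimal sketch of one plausible reading: a convex combination of source- and target-domain Q estimates with a decaying mixing weight. The names `decay_weight` and `hybrid_q`, and the specific schedule, are illustrative assumptions; the actual $Q$Avatar combination rule is defined in the paper, not here.

```python
import numpy as np

def decay_weight(t, beta0=1.0, rate=1e-3):
    """Hypothetical decay schedule (an assumption, not the paper's): beta_t -> 0
    as t grows, so reliance on the source-domain Q function vanishes over time."""
    return beta0 / (1.0 + rate * t)

def hybrid_q(q_source, q_target, t):
    """Combine source- and target-domain Q estimates with a decaying weight.

    q_source, q_target: arrays of Q-values over the (mapped) action set.
    As beta_t -> 0 the hybrid estimate reduces to the target Q function,
    matching the robustness intuition: a poor source-domain model cannot
    dominate the target-domain learning in the long run.
    """
    beta = decay_weight(t)
    return beta * q_source + (1.0 - beta) * q_target

# Usage: pick the greedy action under the hybrid estimate.
q_src = np.array([0.2, 0.8, 0.5])  # source-domain Q-values (after inter-domain mapping)
q_tgt = np.array([0.1, 0.3, 0.9])  # target-domain Q-values
action = int(np.argmax(hybrid_q(q_src, q_tgt, t=1000)))
```

Under this reading, the decaying weight is what hedges against negative transfer: early on the agent exploits source knowledge, while in the limit it relies only on the target-domain estimate.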
Submission Number: 62