Cross-Domain Reinforcement Learning Under Distinct State-Action Spaces Via Hybrid Q Functions

Kuan-Chen Pan; MingHong Chen; You-De Huang; Xi Liu; Ping-Chun Hsieh

26 Sept 2024 (modified: 05 Feb 2025). Submitted to ICLR 2025. License: CC BY 4.0

Keywords: Cross-domain transfer; Transfer learning; Reinforcement learning

Abstract: Cross-domain reinforcement learning (CDRL) aims to improve the data efficiency of RL by leveraging data samples collected in a source domain to facilitate learning in a similar target domain. Despite its potential, cross-domain transfer in RL faces two fundamental and intertwined challenges: (i) the source and target domains can have distinct state or action spaces, which makes direct transfer infeasible and requires more sophisticated inter-domain mappings; (ii) domain similarity in RL is not easily identifiable a priori, so CDRL can be prone to negative transfer. In this paper, we propose to tackle these two challenges jointly through the lens of hybrid Q functions. Specifically, we propose QAvatar, which combines the Q functions from the source and target domains using a proper weight decay function. Under this design, we characterize the convergence behavior of QAvatar and show that it achieves reliable transfer, in the sense that it effectively leverages a source-domain Q function for knowledge transfer to the target domain. Through extensive experiments, we demonstrate that QAvatar achieves superior transferability across domains on a variety of RL benchmark tasks, such as locomotion and robot arm manipulation, even in scenarios of potential negative transfer.
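To make the idea of a hybrid Q function concrete, the following is a minimal tabular sketch of mixing a source-domain and a target-domain Q value with a decaying weight. It is an illustration only and not the QAvatar algorithm itself: the abstract does not give the exact combination rule, so the convex mixture, the decay schedule `decay_weight`, and the inter-domain mappings `state_map` and `action_map` are all assumptions introduced here.

```python
from typing import Callable, Dict, Hashable, Tuple

QTable = Dict[Tuple[Hashable, Hashable], float]


def decay_weight(t: int, rate: float = 0.1) -> float:
    """Hypothetical schedule: the weight on the source Q function shrinks as t grows."""
    return 1.0 / (1.0 + rate * t)


def hybrid_q(
    q_target: QTable,
    q_source: QTable,
    state_map: Callable[[Hashable], Hashable],
    action_map: Callable[[Hashable], Hashable],
    s: Hashable,
    a: Hashable,
    t: int,
) -> float:
    """Assumed convex mixture of target and (mapped) source Q values at step t."""
    alpha = decay_weight(t)
    # The source Q function lives on a different state-action space, so the
    # target pair (s, a) is first mapped into the source domain.
    source_value = q_source[(state_map(s), action_map(a))]
    target_value = q_target[(s, a)]
    return (1.0 - alpha) * target_value + alpha * source_value


# Toy example: one target state-action pair mapped to one source pair.
q_tar: QTable = {(0, 0): 2.0}
q_src: QTable = {("x", "u"): 10.0}
to_source_state = lambda s: "x"   # hypothetical state mapping
to_source_action = lambda a: "u"  # hypothetical action mapping

early = hybrid_q(q_tar, q_src, to_source_state, to_source_action, 0, 0, t=0)
later = hybrid_q(q_tar, q_src, to_source_state, to_source_action, 0, 0, t=10)
```

In this sketch the hybrid value relies entirely on the source Q function at `t=0` and shifts toward the target Q function as `t` grows, which is one plausible way a decaying weight could limit negative transfer when the two domains turn out to be dissimilar.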

Supplementary Material: zip

Primary Area: reinforcement learning

Submission Number: 6552
