Keywords: Reinforcement Learning, Cross-Domain Policy Transfer, Sim-to-Real, Hybrid Offline-and-Online RL
TL;DR: H2O+ offers great flexibility to bridge various choices of offline and online learning methods while also accounting for dynamics gaps between the real and simulation environments.
Abstract: Solving complex real-world tasks with reinforcement learning (RL) can be quite challenging without high-fidelity simulation environments or large amounts of offline data. Online RL agents trained in imperfect simulation environments can suffer from severe sim-to-real issues. Offline RL approaches, although they bypass the need for simulators, often impose demanding requirements on the size and quality of the offline datasets. The recently emerging hybrid offline-and-online RL paradigm provides an attractive framework that enables the joint use of limited offline data and an imperfect simulator for transferable policy learning. In this paper, we develop a new algorithm, called H2O+, which offers great flexibility to bridge various choices of offline and online learning methods, while also accounting for dynamics gaps between the real and simulation environments. Through extensive simulation and real-world robotics experiments, we demonstrate superior performance and flexibility over advanced cross-domain online and offline RL algorithms.
Primary Subject Area: Other
Paper Type: Research paper: up to 8 pages
DMLR For Good Track: Participate in DMLR for Good Track
Participation Mode: In-person
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 11