Diverse Offline Imitation via Fenchel Duality

Published: 20 Jul 2023, Last Modified: 31 Aug 2023 | EWRL16 | Readers: Everyone
Keywords: offline RL, Fenchel duality, skill discovery, imitation
TL;DR: We propose a Fenchel dual approach to constrained offline diverse policy extraction.
Abstract: There has been significant recent progress in unsupervised skill discovery, with various works proposing mutual-information-based objectives as a source of intrinsic motivation. Prior work has predominantly focused on algorithms that require online access to the environment. In contrast, we develop an offline skill discovery algorithm. Our problem formulation maximizes a mutual information objective subject to KL-divergence constraints. More precisely, the constraints ensure that the state occupancy of each skill remains close to the state occupancy of an expert, within the support of an offline dataset with good state-action coverage. Our main contribution is to connect Fenchel-Rockafellar duality, reinforcement learning, and unsupervised skill discovery, and to give a simple offline algorithm for learning diverse skills that are aligned with an expert.
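As an illustrative sketch only, the constrained problem described in the abstract could be written as below. The notation is assumed here and is not taken from the paper: Z is the skill variable, S the state, d_z the state occupancy of skill z, d_E the state occupancy of the expert, and \epsilon a hypothetical tolerance. The direction of the KL-divergence is likewise an assumption.

```latex
% Assumed notation (not from the source): Z skill, S state,
% d_z and d_E state occupancies, \epsilon > 0 a hypothetical tolerance.
\max_{\{d_z\}_{z}} \; I(S; Z)
\quad \text{s.t.} \quad
D_{\mathrm{KL}}\!\left(d_z \,\|\, d_E\right) \le \epsilon
\quad \text{for every skill } z .
```

Under this reading, the objective rewards skills whose visited states are distinguishable from one another, while each constraint keeps a skill's state occupancy near that of the expert.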