Lipschitz Lifelong Reinforcement Learning

Erwan Lecarpentier; David Abel; Kavosh Asadi; Yuu Jinnai; Emmanuel Rachelson; Michael L. Littman

Lipschitz Lifelong Reinforcement Learning

Erwan Lecarpentier, David Abel, Kavosh Asadi, Yuu Jinnai, Emmanuel Rachelson, Michael L. Littman

25 Sept 2019 (modified: 22 Jun 2025)ICLR 2020 Conference Blind SubmissionReaders: Everyone

Keywords: Reinforcement Learning, Lifelong Learning

TL;DR: We analyze theoretically how the optimal value function changes across tasks and derive a method for non-negative transfer of value functions in Lifelong Reinforcement Learning.

Abstract: We consider the problem of reusing prior experience when an agent is facing a series of Reinforcement Learning (RL) tasks. We introduce a novel metric between Markov Decision Processes and focus on the study and exploitation of the optimal value function's Lipschitz continuity in the task space with respect to that metric. These theoretical results lead us to a value transfer method for Lifelong RL, which we use to build a PAC-MDP algorithm that exploits continuity to accelerate learning. We illustrate the benefits of the method in Lifelong RL experiments.

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/lipschitz-lifelong-reinforcement-learning/code)

Original Pdf: pdf

9 Replies

Loading