A Constrained Multi-Objective Reinforcement Learning Framework

Sandy Huang; Abbas Abdolmaleki; Giulia Vezzani; Philemon Brakel; Daniel J Mankowitz; Michael Neunert; Steven Bohez; Yuval Tassa; Nicolas Heess; Martin Riedmiller; Raia Hadsell

Published: 13 Sept 2021, Last Modified: 05 May 2023

CoRL 2021 Poster

Readers: Everyone

Keywords: constrained RL, multi-objective RL, deep RL

Abstract: Many real-world problems, especially in robotics, require that reinforcement learning (RL) agents learn policies that not only maximize an environment reward but also satisfy constraints. We propose a high-level framework for solving such problems that treats the environment reward and costs as separate objectives, and learns what preference over objectives the policy should optimize for in order to meet the constraints. We call this Learning Preferences and Policies in Parallel (LP3). By making different choices for how to learn the preference and how to optimize the policy given the preference, we can obtain existing approaches (e.g., Lagrangian relaxation) and derive novel approaches that lead to better performance. One of these is an algorithm that learns a set of constraint-satisfying policies, which is useful when the exact constraint is not known a priori.
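
To make the idea concrete, here is a minimal toy sketch of Lagrangian relaxation, which the abstract names as one existing approach the framework can recover. It is not the LP3 algorithm from the paper. Everything in it is an assumption made for illustration: the one-step problem with a scalar action, the `reward` and `cost` functions, the cost limit, and the step sizes. The multiplier `lam` acts as the weight on the cost objective relative to the reward, and it is adjusted alongside the policy parameter until the constraint is met.

```python
# Toy Lagrangian relaxation on a one-step problem (illustrative assumptions only).
# The "policy" is a single scalar action a; there is no environment or neural network.

def reward(a):
    # Hypothetical reward: highest at a = 2.
    return -(a - 2.0) ** 2


def cost(a):
    # Hypothetical cost: grows with the magnitude of the action.
    return a ** 2


def lagrangian_relaxation(cost_limit=1.0, steps=5000,
                          lr_policy=0.01, lr_multiplier=0.01):
    a = 0.0    # policy parameter
    lam = 0.0  # Lagrange multiplier: weight on the cost objective
    for _ in range(steps):
        # Policy step: gradient ascent on reward(a) - lam * cost(a).
        grad_a = -2.0 * (a - 2.0) - lam * 2.0 * a
        a += lr_policy * grad_a
        # Multiplier step: raise lam while the constraint is violated,
        # lower it otherwise, and keep it non-negative.
        lam = max(0.0, lam + lr_multiplier * (cost(a) - cost_limit))
    return a, lam


if __name__ == "__main__":
    a, lam = lagrangian_relaxation()
    print(f"action={a:.4f} multiplier={lam:.4f} cost={cost(a):.4f}")
```

For this toy problem the constrained optimum can be worked out by hand: the unconstrained best action a = 2 violates cost(a) <= 1, so the solution sits on the constraint boundary at a = 1, where the stationarity condition gives a multiplier of 1. The loop should settle near those values.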

Supplementary Material: zip

Poster: png
