Towards Sample-Efficient Multi-Objective Reinforcement Learning

Published: 2023, Last Modified: 29 Sept 2023, AAMAS 2023, Readers: Everyone
Abstract: In sequential decision-making problems, the objective that a reinforcement learning agent seeks to optimize is often modeled via a reward function. However, in real-world problems, agents often have to optimize multiple (possibly conflicting) objectives. This setting is known as multi-objective reinforcement learning (MORL). In MORL, the goal of the agent is not to learn a single policy, but a set of policies, each of which specializes in optimizing a single objective or a combination of objectives. In my Ph.D., I investigate methods that allow the agent to learn a carefully constructed set of policies that can be combined to solve challenging MORL problems in a sample-efficient manner. In this paper, I present a brief overview of my work on this topic and focus on two main contributions: (i) a novel algorithm for optimal policy transfer based on theoretical equivalences between successor features and MORL; and (ii) a novel MORL algorithm based on generalized policy improvement that learns a set of policies that is guaranteed to contain an optimal policy for any possible preferences of the agent over the objectives.
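To make the idea of combining a set of policies more concrete, the following is a minimal sketch of generalized policy improvement (GPI) over successor features, assuming the agent's preferences are a linear weight vector over the objectives. It is not the algorithm from the paper: the function names (`dot`, `gpi_action`), the data layout, and the toy numbers are hypothetical and chosen only for illustration. For a fixed state, each policy i provides one successor-feature vector per action; its value under preferences w is the dot product of that vector with w, and GPI acts greedily with respect to the maximum of these values over all policies.

```python
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gpi_action(psi_per_policy: List[List[Sequence[float]]],
               w: Sequence[float]) -> int:
    """Pick an action by GPI in a single, fixed state.

    psi_per_policy[i][a] is the (assumed known) successor-feature vector
    of policy i for action a in the current state; w is the preference
    weight vector over objectives (an assumption: linear preferences).
    """
    n_actions = len(psi_per_policy[0])
    best_action, best_value = 0, float("-inf")
    for a in range(n_actions):
        # Value of action a under the best policy in the set for weights w.
        value = max(dot(psi[a], w) for psi in psi_per_policy)
        if value > best_value:
            best_action, best_value = a, value
    return best_action


# Hypothetical toy data: two policies, two actions, two objectives.
psi_per_policy = [
    [[1.0, 0.0], [0.2, 0.1]],  # policy 0, specialized in objective 0
    [[0.0, 0.3], [0.1, 1.0]],  # policy 1, specialized in objective 1
]

print(gpi_action(psi_per_policy, [1.0, 0.0]))  # preference for objective 0
print(gpi_action(psi_per_policy, [0.0, 1.0]))  # preference for objective 1
```

In this sketch the same fixed set of policies yields different actions as the preference weights change, without any further learning, which is the sense in which a well-chosen policy set can be reused across preferences.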