Multi-objective Reinforcement Learning: A Tool for Pluralistic Alignment

Published: 10 Oct 2024, Last Modified: 15 Nov 2024 · Pluralistic-Alignment 2024 · CC BY 4.0
Keywords: alignment, pluralistic alignment, multi-objective reinforcement learning, RLHF
TL;DR: This paper provides an overview of the role that multi-objective reinforcement learning can play in creating pluralistically-aligned AI.
Abstract: Reinforcement learning (RL) is a valuable tool for the creation of AI systems. However, it may be difficult to adequately align RL systems based on scalar rewards when there are multiple conflicting values or stakeholders to be considered. Over the last decade, multi-objective reinforcement learning (MORL) using vector rewards has emerged as an alternative to standard, scalar RL. This paper provides an overview of the role that MORL can play in creating pluralistically-aligned AI.
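As a minimal illustration of the vector-reward idea the abstract contrasts with scalar RL, the sketch below shows linear scalarisation of a hypothetical three-objective reward under a stakeholder preference vector. The objective names, weights, and values are illustrative assumptions, not details from the paper.

```python
import numpy as np

# Hypothetical per-objective rewards from one environment step,
# one entry per value/stakeholder (names are assumptions for illustration).
vector_reward = np.array([0.8, -0.2, 0.5])   # e.g. helpfulness, cost, safety

# A preference (weight) vector encoding how one stakeholder trades off objectives.
weights = np.array([0.5, 0.2, 0.3])

# Linear scalarisation collapses the vector reward to a single scalar,
# which is what standard (single-objective) RL would optimise.
scalar_reward = float(weights @ vector_reward)
print(scalar_reward)
```

In a MORL setting, rather than fixing one weight vector up front, training can retain the vector-valued reward and learn a set of policies covering many preference vectors, so that different stakeholders' trade-offs can be served at deployment time.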
Submission Number: 12