Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI

Published: 10 Oct 2024, Last Modified: 15 Nov 2024 · Pluralistic-Alignment 2024 · CC BY 4.0
Keywords: alignment, pluralistic alignment, multi-objective reinforcement learning
TL;DR: This paper describes an approach for adapting to user preferences through Multi-Objective Reinforcement Learning (MORL)
Abstract: Emerging research in Pluralistic AI alignment seeks to address how to design and deploy intelligent systems in accordance with diverse human needs and values. We contribute a potential approach for aligning AI with diverse and shifting user preferences through Multi-Objective Reinforcement Learning (MORL), via post-learning policy-selection adjustment. This paper introduces the proposed framework, outlines its anticipated advantages and assumptions, and discusses technical details for implementation. We also examine the broader implications of adopting a retroactive alignment approach from a sociotechnical systems perspective.
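The abstract's core idea, selecting among pre-trained policies after learning rather than retraining, can be sketched minimally. The snippet below is an illustration, not the paper's implementation: it assumes a hypothetical precomputed set of Pareto-optimal policies (each summarized by its estimated return on two objectives) and uses simple linear scalarization of user preference weights to pick a policy at deployment time.

```python
import numpy as np

# Hypothetical Pareto set: each policy's estimated return on two
# objectives (e.g., helpfulness vs. harmlessness). Values are
# illustrative only, not from the paper.
pareto_policies = {
    "policy_a": np.array([0.9, 0.2]),
    "policy_b": np.array([0.6, 0.6]),
    "policy_c": np.array([0.2, 0.9]),
}

def select_policy(preference_weights, policies):
    """Return the policy name maximizing the linearly scalarized
    return under the user's current preference weights. This is a
    post-learning adjustment: no retraining, only selection."""
    w = np.asarray(preference_weights, dtype=float)
    w = w / w.sum()  # normalize to a convex combination
    return max(policies, key=lambda name: float(w @ policies[name]))

# A user weighting the first objective heavily selects policy_a;
# as preferences shift toward the second objective, the selection
# adapts immediately, with no further training.
print(select_policy([0.8, 0.2], pareto_policies))  # policy_a
print(select_policy([0.2, 0.8], pareto_policies))  # policy_c
```

Under this (assumed) linear-scalarization view, adapting to a shifting preference reduces to re-running the selection step, which is what makes the approach "retroactive" with respect to training.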
Submission Number: 18