Plurality of value pluralism and AI value alignment

Published: 10 Oct 2024, Last Modified: 15 Nov 2024 · Pluralistic-Alignment 2024 · CC BY 4.0
Keywords: Value Pluralism, Philosophy, Value Alignment, Collective Alignment, Democratic Alignment
TL;DR: I develop a two-level framework for AI value alignment that addresses both implementing values (first-order choices) and determining who legitimately makes those decisions (second-order choices) to achieve meaningful pluralism.
Abstract: AI value alignment efforts increasingly emphasize value pluralism, but implementing value pluralism itself involves contested choices. This paper introduces a two-level framework distinguishing between first-order value choices (implementing specific accounts of values) and second-order value choices (determining the legitimacy of these first-order value selections and implementations). I argue that genuine pluralistic value alignment requires explicit engagement with both levels. While first-order choices involve defining and measuring values, second-order choices address who has legitimate authority to make first-order value decisions and through what processes. By decomposing value pluralism into distinct components, the framework yields two critical insights. First, it helps prevent "pluralistic value-washing", where superficial appeals to pluralism could mask fundamentally monistic alignment approaches. Second, it reveals that there is no single "correct" implementation of value pluralism: attempts to converge on "pluralism" as a universally good approach fundamentally contradict pluralistic principles themselves. To enable more meaningful tracking of pluralistic value alignment in both single- and multi-agent AI systems, I propose developing "value cards" based on the components of this normative framework.
Submission Number: 36