Scopes of Alignment

Published: 02 Jan 2025, Last Modified: 03 Mar 2025, AAAI 2025 Workshop AIGOV Oral, CC BY 4.0
Keywords: alignment, values
TL;DR: We motivate why we need to move beyond context-less and generic values of helpfulness, harmlessness, and honesty.
Abstract:

Much of the research on AI alignment seeks to align large language models and other foundation models to the context-less and generic values of helpfulness, harmlessness, and honesty. Frontier model providers likewise strive to align their models with these values. In this paper, we motivate why we need to move beyond such a limited conception and propose three dimensions for doing so. The first scope of alignment is competence: the knowledge, skills, or behaviors the model must possess to be useful for its intended purpose. The second scope of alignment is transience: either semantic or episodic, depending on the context of use. The third scope of alignment is audience: mass, public, small-group, or dyadic. At the end of the paper, we use the proposed framework to position some technologies and workflows that go beyond prevailing notions of alignment.

Submission Number: 9