Keywords: Constitutional AI, AI Alignment, Rule Ambiguity, AI Law and Policy
Abstract: AI systems are increasingly governed by natural language principles, yet a key challenge arising from this reliance on language remains under-explored: interpretive ambiguity. Ambiguity arises both from how these principles are written and from how they are applied. While legal systems use institutional safeguards to manage such ambiguity, comparable protections are often missing from AI alignment pipelines. Different interpretations of the same rule can lead to inconsistent or unstable model behavior. We identify key gaps in current alignment pipelines, drawing on how legal systems constrain ambiguity at both the rule-creation and rule-application steps. We then propose a computational framework: (1) a rule refinement pipeline that minimizes interpretive disagreement by revising ambiguous rules, and (2) prompt-based interpretive constraints that reduce inconsistency in rule application. We evaluate our framework on a 5,000-scenario subset of the WildChat dataset and show that both interventions significantly improve judgment consistency across a panel of reasonable interpreters. Our approach offers a first step toward systematically managing interpretive ambiguity, a capability essential for building more robust, rule-following AI systems.
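The rule refinement pipeline described in the abstract can be pictured as a simple loop: score each rule by the disagreement it induces across a panel of interpreters, and revise it until judgments converge. The sketch below is an illustrative assumption, not the paper's actual implementation; the judge functions, the `revise` callback, and the disagreement threshold are all hypothetical placeholders.

```python
from collections import Counter

def disagreement(judgments):
    """Fraction of panel votes that differ from the majority judgment."""
    _, majority_count = Counter(judgments).most_common(1)[0]
    return 1.0 - majority_count / len(judgments)

def refine(rule, scenarios, panel, revise, threshold=0.2, max_rounds=5):
    """Revise `rule` until every scenario's panel disagreement is at or
    below `threshold`, or `max_rounds` revisions have been attempted.

    panel:  list of judge functions (rule, scenario) -> verdict label
    revise: function (rule, scenarios, scores) -> revised rule,
            e.g. an LLM prompted to rewrite the ambiguous rule
    """
    for _ in range(max_rounds):
        scores = [
            disagreement([judge(rule, s) for judge in panel])
            for s in scenarios
        ]
        if max(scores) <= threshold:
            break  # interpreters now agree on every scenario
        rule = revise(rule, scenarios, scores)
    return rule
```

In this toy form, the panel plays the role of the paper's "reasonable interpreters": a rule is considered refined once no evaluated scenario splits the panel beyond the chosen tolerance.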
Submission Number: 89