Policy Dreamer: Diverse Public Policy Generation Via Elicitation and Simulation of Human Preferences

Arjun Karanam; José Ramón Enríquez; Udari Madhushani Sehwag; Michael Elabd; Kanishk Gandhi; Noah Goodman; Sanmi Koyejo

Published: 09 Oct 2024, Last Modified: 04 Dec 2024; SoLaR Poster; CC BY 4.0

Track: Technical

Keywords: Diverse Policy Generation, AI-Assisted Policymaking, Pluralistic Alignment, Language Model Applications

TL;DR: Policy Dreamer: AI framework for socially responsible policy generation. Uses LLMs to create diverse, constituency-aligned proposals across domains. Explores AI's potential for enhancing democratic policymaking.

Abstract: Developing public policies that address complex societal issues while representing diverse perspectives remains a significant challenge in governance. This paper presents Policy Dreamer, a preference aggregation method based on evolutionary dynamics that is designed to create public policies aligned with heterogeneous populations while preserving solution diversity. It operates in three stages: a) Initial Public Policy Generation (where a public policy is defined as a set of goals, actions, and strategies aimed at addressing a specific societal issue), b) Preference Elicitation from a constituency of humans, and c) Policy Refinement using simulated human feedback. We apply this approach to public policy creation, which requires navigating complex socioeconomic trade-offs. To validate our method, we measure our system's ability to create popular yet diverse policy proposals in three domains: Healthcare, Gun Control, and Social Media regulation. Our approach iteratively aligns policies with a base constituency, while using evolutionary search to ensure that policy diversity is not compromised. When compared to an expert-crafted set of policies, it generates novel policies, with up to 25% of generated policies being novel. However, it exhibits limitations in capturing the full diversity of these expert-crafted policies, particularly in controversial or emerging policy domains. Overall, our preliminary results suggest that Large Language Models (LLMs) are able to actively elicit preferences from a constituency of people and iteratively generate statements (public policies) that align with this constituency while preventing a collapse in statement diversity.
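The abstract does not specify how the refinement stage balances popularity against diversity, so the following is only a minimal illustrative sketch of one way a diversity-preserving evolutionary loop could look. It is not the authors' implementation. All names here (`jaccard_distance`, `select_diverse`, `refine`, `min_distance`) are hypothetical, the `score` callable stands in for simulated constituency approval, and the `mutate` callable stands in for an LLM rewrite of a policy; word-level Jaccard distance is an assumed, deliberately crude diversity measure.

```python
from typing import Callable, List


def jaccard_distance(a: str, b: str) -> float:
    """Word-level Jaccard distance: 0.0 for identical word sets, 1.0 for disjoint ones."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    if not union:
        return 0.0
    return 1.0 - len(words_a & words_b) / len(union)


def select_diverse(
    candidates: List[str],
    score: Callable[[str], float],
    k: int,
    min_distance: float = 0.5,
) -> List[str]:
    """Greedily keep up to k top-scoring policies that are mutually dissimilar."""
    unique = list(dict.fromkeys(candidates))  # drop duplicates, keep order
    ranked = sorted(unique, key=score, reverse=True)
    kept: List[str] = []
    for policy in ranked:
        # Reject a candidate that is too close to an already kept policy.
        if all(jaccard_distance(policy, other) >= min_distance for other in kept):
            kept.append(policy)
        if len(kept) == k:
            break
    return kept


def refine(
    initial: List[str],
    mutate: Callable[[str], str],
    score: Callable[[str], float],
    generations: int = 3,
    k: int = 3,
    min_distance: float = 0.5,
) -> List[str]:
    """Alternate variation (mutate) and diversity-aware selection for a few generations."""
    population = list(initial)
    for _ in range(generations):
        offspring = [mutate(policy) for policy in population]
        population = select_diverse(population + offspring, score, k, min_distance)
    return population
```

In this sketch, a near-duplicate of a higher-scoring policy is dropped even if it outscores more distinct alternatives, which is one simple way to keep the population from collapsing onto a single popular proposal.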

Submission Number: 99
