Keywords: Pluralistic Alignment, Factual Pluralism
TL;DR: Factual claims are an underexplored category in pluralistic alignment research. We introduce a dataset based on Wikidata and test four LLMs; none acknowledge dispute reliably.
Abstract: Pluralistic alignment seeks to build AI systems that represent the range of legitimate human values rather than collapsing them into a single averaged answer. Current work in this area, however, focuses predominantly on subjective and preference-based tasks. Pluralistic concerns extend to factual questions as well: a question that appears to have one correct answer may in fact have several legitimate answers, depending on which source or authority one recognizes. Models that confidently assert a single value, rather than acknowledging such plurality, risk the same representational failures that motivate pluralistic alignment in subjective settings. To address this gap, we introduce an early benchmark for factual pluralism and evaluate whether LLMs acknowledge dispute. None of the models tested acknowledges dispute reliably, and reasoning mode tends to reduce rather than improve performance.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 147