Fair-SP: Capturing Pluralistic Social Equity Preferences Through Synthetic Data

20 Sept 2025 (modified: 06 Jan 2026), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0
Keywords: Human Preference, LLM alignment
Abstract: Human preference plays a crucial role in understanding social values and developing inclusive AI systems. However, collecting comprehensive human preference feedback is costly, and most existing datasets neglect the pluralism of preferences across social segments, particularly in social equity domains. To address this gap, we introduce FAIR-SP, a synthetic dataset that captures pluralistic social segment preferences on equity issues and is systematically constructed with theoretical guidance from multiple disciplines, including sociology and philosophy. FAIR-SP encompasses 28 social groups, 98 equity topics, and 5 preference dimensions. Through automatic question generation mechanisms, it provides both concise template-based questions and narrative-driven contextualized scenario questions, yielding 238,623 preference records via GPT-4o-mini role-playing based on seven representative UK public segments, with extensions to other regional contexts. We validate the dataset's quality through multiple complementary approaches, achieving over 90% role-play fidelity and human evaluation scores exceeding 0.7. We demonstrate the dataset's utility through targeted equity preference alignment experiments and an equity positioning analysis of LLMs across global regions. FAIR-SP establishes a foundational resource for understanding and incorporating pluralistic social values, especially in the era of LLMs.
Primary Area: datasets and benchmarks
Submission Number: 24316