Abstract: Do Large Language Models (LLMs) hold positions that conflict with your country's values? In this paper, we introduce NaVAB, a comprehensive benchmark designed to evaluate the alignment of LLMs with the values of five major nations: China, the United States, the United Kingdom, France, and Germany. Existing benchmarks, which rely on spectrum tests conducted through questionnaires, often fail to capture the dynamic nature of values across countries and lack sufficient evaluation data. To address these limitations, NaVAB implements a value data extraction pipeline to efficiently construct value assessment datasets. This process includes a Conflict Reduction mechanism that filters out non-conflicting values, yielding a high-quality benchmark. Through extensive experiments on various LLMs (spanning Base vs. Instruct models, non-MoE vs. MoE architectures, and open- vs. closed-source models), we demonstrate that LLMs can be effectively aligned with multi-national values using NaVAB.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking, evaluation, NLP datasets
Contribution Types: Data resources
Languages Studied: English, Chinese, French, German
Submission Number: 868