Abstract: Do Large Language Models (LLMs) hold positions that conflict with your country's values? In this paper, we introduce NaVAB, a comprehensive benchmark designed to evaluate the alignment of LLMs with the values of five major nations: China, the United States, the United Kingdom, France, and Germany. Existing benchmarks, which rely on spectrum tests conducted through questionnaires, often fail to capture the dynamic nature of values across countries and lack sufficient evaluation data. To address these limitations, NaVAB implements a value data extraction pipeline to efficiently construct value assessment datasets. This process includes a Conflict Reduction mechanism that filters out non-conflicting values, yielding a high-quality benchmark. Through extensive experiments on various LLMs (spanning Base vs. Instruct models, non-MoE vs. MoE architectures, and open- vs. closed-source models), we demonstrate that LLMs can be effectively aligned with multi-national values using NaVAB.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: benchmarking, evaluation, NLP datasets
Contribution Types: Data resources
Languages Studied: English, Chinese, French, German
Submission Number: 868