CELM: A Dataset for Chinese Ethico-Legal Alignment in Large Language Models

ACL ARR 2024 June Submission4385 Authors

16 Jun 2024 (modified: 03 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Existing Chinese datasets for aligning large language models (LLMs) with human preferences often reflect U.S.-centric values due to their annotation process, which limits their usefulness for developing safe and culturally appropriate LLMs for China, one of the largest LLM markets in the world. In this work, we introduce ``CELM'', a comprehensive Chinese-centric dataset for i) training LLMs to align with Chinese societal values and ii) assessing their safety in the Chinese context. The dataset covers 17 important scenarios, three of which are unique to China. It comprises 1,337 instances innovatively annotated with Chinese legal and ethical norms for fine-tuning; 46,633 instances judged according to the safety preferences of native Chinese crowdworkers for reinforcement learning; and 2,111 evaluation examples produced via human-in-the-loop red teaming to rigorously examine the safety of LLMs in the Chinese cultural context. Our studies show that models trained on CELM produce safer and more culturally appropriate responses for China than those trained on datasets biased towards U.S. norms.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: large language model, alignment, dataset, Chinese culture
Contribution Types: Data resources
Languages Studied: Chinese
Submission Number: 4385