User-Level Safety Alignment

ACL ARR 2025 February Submission 3630 Authors

15 Feb 2025 (modified: 09 May 2025) · ACL ARR 2025 February Submission · License: CC BY 4.0
Abstract: Current safety alignment methods often apply a one-size-fits-all approach, overlooking the unique needs of different users. This limits the effectiveness of Large Language Models (LLMs) for professionals who rely on them in their work or research. To overcome this issue, we introduce a novel task called User-Level Safety Alignment, which requires LLMs to adapt their safety alignment to a user's specific role and provide tailored responses accordingly. Complementing this task, we have developed a large-scale User-Level Safety Alignment dataset, specifically designed to train and evaluate models on role-based safety. Our experiments show that our dataset significantly enhances the model's ability to provide safe, reliable, and tailored responses, paving the way for LLMs that are not only more robust but also more attuned to the diverse needs of users.
Paper Type: Short
Research Area: NLP Applications
Research Area Keywords: Trustworthy LLM, Safety Alignment, QA datasets
Contribution Types: NLP engineering experiment, Data resources, Data analysis
Languages Studied: English
Submission Number: 3630