Foundational Challenges in Assuring Alignment and Safety of Large Language Models

TMLR Paper 2632 Authors

06 May 2024 (modified: 13 May 2024) · Under review for TMLR · CC BY-SA 4.0
Abstract: This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose 200+ concrete research questions.
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Greg_Durrett1
Submission Number: 2632