Do not write that jailbreak paper

Published: 23 Jan 2025, Last Modified: 26 Feb 2025ICLR 2025 Blogpost TrackEveryoneRevisionsBibTeXCC BY 4.0
Blogpost Url: https://d2jud02ci9yv69.cloudfront.net/2025-04-28-do-not-write-jaibreak-papers-57/blog/do-not-write-jaibreak-papers/
Abstract: Jailbreaks are becoming a new ImageNet competition instead of helping us better understand LLM security. This blogpost surveys the jailbreak literature to extract the most important contributions and encourages the community to revisit their choices and focus on research that can uncover new security vulnerabilities.
Conflict Of Interest: N/A
Submission Number: 31
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview