AGD: Adversarial Game Defense Against Jailbreak Attacks in Large Language Models
Shilong Pan, Zhiliang Tian, Zhen Huang, Wanlong Yu, Zhihua Wen, Xinwang Liu, Kai Lu, Minlie Huang, Dongsheng Li
Published: 01 Jan 2025, Last Modified: 25 Jul 2025
ACL (1) 2025
CC BY-SA 4.0