Probe before You Talk: Towards Black-box Defense against Backdoor Unalignment for Large Language Models
Biao Yi, Tiansheng Huang, Sishuo Chen, Tong Li, Zheli Liu, Zhixuan Chu, Yiming Li
Published: 01 Jan 2025, Last Modified: 16 May 2025
ICLR 2025
CC BY-SA 4.0