GUARD: Guideline Upholding Test through Adaptive Role-play and Jailbreak Diagnostics for LLMs.

Haibo Jin, Ruoxi Chen, Peiyan Zhang, Andy Zhou, Yang Zhang, Haohan Wang

11 Jan 2026 | CoRR 2025 | Everyone | CC BY-SA 4.0