Latent-space adversarial training with post-aware calibration for defending large language models against jailbreak attacks

Published: 2026, Last Modified: 25 Jan 2026, Expert Syst. Appl. 2026, CC BY-SA 4.0