Clean-label backdoor attack and defense: An examination of language model vulnerability

Published: 01 Jan 2025, Last Modified: 27 Sept 2025. Venue: Expert Syst. Appl. 2025. License: CC BY-SA 4.0
Abstract: Highlights
• We propose a novel clean-label backdoor method using prompts as triggers.
• We first explore defense algorithms against backdoor attacks that leverage LoRA.
• Our attack method achieves state-of-the-art attack success rates.