An Adversarial Attack Framework via Decision Boundary Drift for Continual Learning

ICLR 2026 Conference Submission 17622 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: Continual Learning, Adversarial Attack
Abstract: Continual learning (CL) is widely used in open environments due to its capacity for dynamic adaptation. However, our analysis reveals that, because of the weight drift that occurs during the parameter update stage, CL models are at serious risk of adversarial attack. To address this issue, we propose an Adversarial Attack framework based on Decision Boundary drift (AADB). It includes: (1) an adversarial sample generation method based on the decision boundary drift phenomenon, which significantly reduces the model's classification accuracy, to 4.41\%; (2) a composite loss function combining a similarity loss and an adversarial loss to optimize adversarial samples, which lowers the classification accuracy on adversarial samples without significantly degrading sample quality; (3) an adversarial attack method that distorts the model's decision boundary by mixing adversarial samples with normal samples, degrading model performance; and (4) a defense framework based on dynamic feature consistency, cross-category contrastive learning, and a resilient rejection mechanism, which suppresses the deformation of the decision boundary caused by adversarial perturbations and raises the rejection rate of adversarial samples to 40.16\%. Experiments on CIFAR-100, Mini-ImageNet, and other datasets demonstrate that the adversarial attack framework is highly effective and provide an algorithmic foundation for subsequent work on lightweight defenses and adaptive attack-detection mechanisms for continual learning.
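
To make item (2) concrete, the following is a minimal sketch of how a composite similarity-plus-adversarial loss could be implemented and used to optimize adversarial samples. This is not the paper's implementation: the function names, the L2 similarity term, the fixed step size, and the trade-off weight lambda_adv are all illustrative assumptions.

```python
# Hypothetical sketch of a composite similarity + adversarial loss and a simple
# gradient-based optimizer for adversarial samples. Names such as
# composite_adversarial_loss, lambda_adv, num_steps, and step_size are
# illustrative, not taken from the paper.
import torch
import torch.nn.functional as F

def composite_adversarial_loss(model, x_clean, x_adv, y_true, lambda_adv=1.0):
    # Similarity loss: penalize large deviations from the clean sample (L2 here).
    sim_loss = F.mse_loss(x_adv, x_clean)
    # Adversarial loss: negated cross-entropy on the true label, so minimizing the
    # composite loss encourages misclassification while limiting distortion.
    adv_loss = -F.cross_entropy(model(x_adv), y_true)
    return sim_loss + lambda_adv * adv_loss

def generate_adversarial_sample(model, x_clean, y_true,
                                num_steps=20, step_size=0.01, lambda_adv=1.0):
    # Plain gradient descent on the perturbed input.
    x_adv = x_clean.clone().detach().requires_grad_(True)
    for _ in range(num_steps):
        loss = composite_adversarial_loss(model, x_clean, x_adv, y_true, lambda_adv)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv -= step_size * grad
            x_adv.clamp_(0.0, 1.0)  # keep pixel values in a valid range
    return x_adv.detach()
```

In this sketch the similarity term plays the role of the quality constraint mentioned in the abstract; the actual loss terms, weighting, and optimization schedule used by AADB may differ.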
Primary Area: interpretability and explainable AI
Submission Number: 17622