BTUAP: Boosting the Transferability of Universal Adversarial Perturbations in the Black-box Setting under Various Data Dependencies

Jie Wan

Published: 30 Oct 2025, Last Modified: 28 Jul 2025 | OpenReview Archive Direct Upload | Everyone | CC BY 4.0

Abstract: Deep neural networks (DNNs) are susceptible to Universal Adversarial Perturbations (UAPs), in which a single input-agnostic perturbation can deceive a DNN on many inputs. Current UAP generation methods are categorized as data-dependent or data-free attacks, according to whether they rely on the training data. However, both strategies exhibit poor transferability in black-box settings. To address this limitation, we propose BTUAP, a novel UAP generation method designed to enhance the transferability of UAPs in the black-box setting. BTUAP employs an ensemble strategy with min-max weight adjustment mechanisms to reduce the impact of model-specific characteristics, and introduces a self-supervised optimization strategy that maximizes the distance between the predicted logits of benign and adversarial samples. Experimental results demonstrate that BTUAP significantly improves transferability across different data dependency settings under black-box constraints. We also quantify the impact of distribution shift and provide a new metric for measuring model robustness. The source code will be available after the paper is published.
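
The abstract gives no equations, so the following is only a toy sketch of the two ideas it names, not the authors' implementation. It assumes small two-layer tanh surrogate models, a squared L2 distance between benign and adversarial logits as the label-free objective, an L-infinity budget `eps`, and a softmin over per-model losses as a stand-in for the min-max weight adjustment. All function and parameter names (`loss_and_grad`, `craft_uap`, `eta`, `step`) are hypothetical.

```python
import numpy as np

def loss_and_grad(x, delta, model):
    """Mean squared logit distance between benign and perturbed inputs,
    plus its analytic gradient with respect to the shared perturbation."""
    A, B = model                                  # assumed toy model: tanh(x @ A) @ B
    clean = np.tanh(x @ A) @ B
    h_adv = np.tanh((x + delta) @ A)
    diff = h_adv @ B - clean                      # (n, classes)
    loss = np.mean(np.sum(diff ** 2, axis=1))
    g_z = (2.0 / len(x)) * (diff @ B.T) * (1.0 - h_adv ** 2)
    grad = (g_z @ A.T).sum(axis=0)                # (dim,), summed over the batch
    return loss, grad

def craft_uap(x, models, eps=0.1, step=0.01, eta=1.0, iters=100, seed=0):
    """Sign-gradient ascent on a weighted ensemble loss, with weights
    recomputed each step to favour the currently hardest-to-fool model."""
    rng = np.random.default_rng(seed)
    delta = rng.uniform(-eps, eps, size=x.shape[1])
    w = np.full(len(models), 1.0 / len(models))
    for _ in range(iters):
        results = [loss_and_grad(x, delta, m) for m in models]
        losses = np.array([r[0] for r in results])
        # Inner "min": softmin over model losses (assumed weighting rule).
        w = np.exp(-eta * (losses - losses.min()))
        w /= w.sum()
        # Outer "max": ascend the weighted loss, then project onto the eps-ball.
        g = sum(wk * r[1] for wk, r in zip(w, results))
        delta = np.clip(delta + step * np.sign(g), -eps, eps)
    return delta, w

# Toy usage on random data and three random surrogate models.
rng = np.random.default_rng(0)
models = [(rng.normal(size=(8, 16)), rng.normal(size=(16, 4))) for _ in range(3)]
x = rng.normal(size=(32, 8))
delta, w = craft_uap(x, models)
```

No labels are used anywhere in the loop, which is the sense in which the objective is self-supervised here; a data-free variant would have to replace `x` with some proxy input, which this sketch does not attempt.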