Exploiting Fine-Tuning Structures to Improve Adversarial Transferability on Downstream SAM

ICLR 2026 Conference Submission9673 Authors

17 Sept 2025 (modified: 29 Nov 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: segment anything model, adversarial attack, attack transferability

Abstract: Combining the Segment Anything Model (SAM) with fine-tuning techniques allows SAM to be adapted effectively to a variety of downstream image segmentation tasks. However, this adaptability also introduces new security vulnerabilities to adversarial attacks. In this paper, we investigate adversarial transferability between the original SAM and its fine-tuned downstream models. Assuming only limited knowledge of the downstream models, we propose a novel structure-exploiting transferable attack (SETA) method. Our framework mimics the fine-tuning architecture and estimates the parameter distributions of the downstream models in order to improve the transferability of the generated adversarial samples. Experimental results show that the proposed method generates adversarial examples that are effective against a range of downstream fine-tuned SAM models.
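
The idea stated in the abstract (mimic the fine-tuning structure, then account for the unknown fine-tuned parameters through a distribution) can be illustrated with a toy sketch. This is not the authors' SETA implementation: the one-layer tanh "encoder", the low-rank adapter form `w + b @ a`, the Gaussian prior with scale `sigma`, the feature-distortion objective, and every function and parameter name below are assumptions chosen only for illustration.

```python
import numpy as np


def feature(w_eff, x):
    """Toy stand-in for an image encoder: one linear layer followed by tanh."""
    return np.tanh(w_eff @ x)


def distortion_grad(w_eff, x, delta):
    """Gradient w.r.t. delta of ||feature(x + delta) - feature(x)||^2."""
    clean = feature(w_eff, x)
    adv = feature(w_eff, x + delta)
    # Chain rule through tanh: d tanh(z) / dz = 1 - tanh(z)^2.
    return w_eff.T @ (2.0 * (adv - clean) * (1.0 - adv ** 2))


def structure_aware_attack(w, x, eps=0.1, alpha=0.02, steps=20,
                           n_samples=8, rank=2, sigma=0.1, seed=0):
    """L-infinity sign-gradient attack averaged over sampled low-rank adapters.

    ASSUMPTION: the unknown downstream weights are modelled as w + b @ a,
    with a and b drawn from a zero-mean Gaussian prior. The real fine-tuning
    structure and parameter estimate may differ.
    """
    rng = np.random.default_rng(seed)
    d_out, d_in = w.shape
    # Random start: the distortion gradient is exactly zero at delta = 0.
    delta = rng.uniform(-eps, eps, size=d_in)
    for _ in range(steps):
        grad = np.zeros(d_in)
        for _ in range(n_samples):
            a = rng.normal(0.0, sigma, size=(rank, d_in))
            b = rng.normal(0.0, sigma, size=(d_out, rank))
            grad += distortion_grad(w + b @ a, x, delta)
        # Ascent step on the averaged gradient, then project onto the eps-ball.
        delta = np.clip(delta + alpha * np.sign(grad / n_samples), -eps, eps)
    return delta


# Usage: craft delta on the surrogate, then evaluate on an unseen "downstream" model.
rng = np.random.default_rng(1)
w = rng.normal(size=(4, 6))           # frozen "pretrained" weights (toy)
x = rng.normal(size=6)                # toy input
delta = structure_aware_attack(w, x)
w_down = w + rng.normal(0.0, 0.1, size=(4, 2)) @ rng.normal(0.0, 0.1, size=(2, 6))
print(np.sum((feature(w_down, x + delta) - feature(w_down, x)) ** 2))
```

Averaging the gradient over sampled adapters is one simple way to avoid overfitting the perturbation to the original weights alone; whether the submission uses this particular objective or sampling scheme is not stated in the abstract.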

Primary Area: other topics in machine learning (i.e., none of the above)

Submission Number: 9673
