One Shot, One Kill: Attacking Video Object Segmentation with a Single Frame

16 Sept 2025 (modified: 12 Nov 2025) | ICLR 2026 Conference Withdrawn Submission | Readers: Everyone | License: CC BY 4.0
Keywords: Backdoor Attack, Video Object Segmentation, One-Shot
TL;DR: We reveal the backdoor threat in video object segmentation for the first time and propose a simple yet effective one-shot backdoor attack named OSBA.
Abstract: Video object segmentation (VOS) plays a pivotal role in numerous critical applications, including autonomous systems and video surveillance. However, the vulnerability of VOS models to backdoor attacks remains unexplored. We introduce the first backdoor attack on VOS models, named One-Shot Backdoor Attack (OSBA), which injects a trigger at an arbitrary position in a single frame to induce persistent segmentation failure in all subsequent frames. Unlike full-shot or few-shot paradigms, which inject triggers into multiple frames, OSBA operates under a one-shot constraint, which is challenging because the trigger appears only transiently. To overcome this, we propose two novel strategies: 1) Object-Centroid Implantation (OCI), which exploits the model's focus on object regions by positioning triggers at victim-object centroids; and 2) Trigger-Region Perturbation (TRP), which enforces trigger awareness through adversarial mislabeling of trigger regions in masks for arbitrary trigger placements. Extensive experiments demonstrate that OSBA drastically degrades segmentation performance (<20% J&F) across VOS models while poisoning only 1% of the training data. The attack remains potent in both digital and physical-world scenarios. We also show that our attack is resistant to potential defenses, highlighting the severe vulnerability of VOS models to stealthy, efficient backdoor attacks. Code will be made available.
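The two strategies named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`object_centroid`, `paste_trigger`, `perturb_mask`), the solid-patch trigger, the binary 0/1 mask format, and the choice to flip labels inside the trigger box are all assumptions made for illustration, since the abstract does not specify how the mislabeling is done.

```python
import numpy as np


def object_centroid(mask):
    """Return the (row, col) centroid of the nonzero pixels of a binary mask."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        raise ValueError("mask contains no object pixels")
    return int(round(ys.mean())), int(round(xs.mean()))


def paste_trigger(frame, trigger, center):
    """Paste `trigger` into a copy of `frame`, centered near `center`.

    The patch is shifted if needed so that it stays inside the frame.
    Returns the new frame and the patch box as (top, left, height, width).
    """
    h, w = trigger.shape[:2]
    H, W = frame.shape[:2]
    top = min(max(center[0] - h // 2, 0), H - h)
    left = min(max(center[1] - w // 2, 0), W - w)
    out = frame.copy()
    out[top:top + h, left:left + w] = trigger
    return out, (top, left, h, w)


def perturb_mask(mask, box):
    """Flip the 0/1 labels inside the trigger box.

    Assumption: label flipping is only a stand-in for the paper's
    "adversarial mislabeling"; the actual rule is not given in the abstract.
    """
    top, left, h, w = box
    out = mask.copy()
    out[top:top + h, left:left + w] = 1 - out[top:top + h, left:left + w]
    return out


# Toy example: an 8x8 frame with a 3x3 object and a 2x2 white trigger patch.
frame = np.zeros((8, 8, 3), dtype=np.uint8)
mask = np.zeros((8, 8), dtype=np.uint8)
mask[2:5, 2:5] = 1
trigger = np.full((2, 2, 3), 255, dtype=np.uint8)

center = object_centroid(mask)                            # OCI-style placement
poisoned_frame, box = paste_trigger(frame, trigger, center)
poisoned_mask = perturb_mask(mask, box)                   # TRP-style mislabeling
```

In this toy setup the trigger lands inside the object because its position is derived from the mask centroid, while the mask edit is confined to the trigger box, so the same `perturb_mask` call would also apply if `center` were chosen arbitrarily instead.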
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 7658