Keywords: Adversarial Attacks & Robustness, Segmentation, Vision for Robotics & Autonomous Driving, Adversarial Learning & Robustness
Abstract: Semantic segmentation models have been widely adopted in various domains, including safety-critical applications such as autonomous driving, where they play a pivotal role in enabling accurate scene understanding and decision-making. However, despite their utility and prevalence, these models remain vulnerable to targeted adversarial perturbations, which can compromise their reliability in real-world deployments. To highlight this challenge, we propose a two-stage attack framework that crafts structured perturbations confined to the central vehicle region, inducing misclassifications while preserving background semantics. First, we generate a pseudo-ground-truth segmentation by inpainting the detected vehicle mask within the central third of the image, enabling the attacker to anticipate the model's response if the target class were absent. Second, we optimize an ℓ∞-bounded perturbation via a hybrid loss combining a mean-squared error term toward the pseudo-ground-truth, total variation regularization for spatial coherence, and a class-wise IoU loss that degrades segmentation across all non-target classes. Finally, we refine the attack using region-specific cross-entropy losses that simultaneously mislead vehicle pixels toward surrounding classes and maintain background consistency. Evaluated on the Cityscapes dataset, our attack achieves over 92% predicted-mask accuracy within the target zone and over 93% background preservation elsewhere with the SegFormer model. These results demonstrate that spatial constraints alone cannot prevent powerful region-based attacks, underscoring the urgent need for robust defense strategies.
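The second-stage hybrid objective described in the abstract can be illustrated with a brief sketch. The Python/PyTorch code below is a minimal illustration, not the authors' implementation: the model interface, the weighting coefficients, the ℓ∞ budget, and the exact form of the class-wise IoU term (here, a soft IoU computed against the pseudo-ground-truth) are all assumptions.

```python
# Minimal sketch of the stage-two hybrid loss (MSE to pseudo-GT + TV + soft IoU).
# All names (model, image, pseudo_gt, region_mask) and hyper-parameters are hypothetical.
import torch
import torch.nn.functional as F

def total_variation(delta):
    # Encourages spatially coherent perturbations.
    return ((delta[..., 1:, :] - delta[..., :-1, :]).abs().mean()
            + (delta[..., :, 1:] - delta[..., :, :-1]).abs().mean())

def soft_iou_loss(probs, onehot, eps=1e-6):
    # Class-wise soft IoU between predicted probabilities and the pseudo-ground-truth.
    inter = (probs * onehot).sum(dim=(2, 3))
    union = (probs + onehot - probs * onehot).sum(dim=(2, 3))
    return 1.0 - ((inter + eps) / (union + eps)).mean()

def attack_step(model, image, delta, pseudo_gt, region_mask,
                eps=8 / 255, alpha=2 / 255, w_mse=1.0, w_tv=0.1, w_iou=1.0):
    # One l_inf-bounded, PGD-style update of a perturbation confined to the vehicle region.
    delta = delta.detach().requires_grad_(True)
    logits = model(image + delta * region_mask)        # perturb only the central vehicle region
    probs = logits.softmax(dim=1)
    onehot = F.one_hot(pseudo_gt, logits.shape[1]).permute(0, 3, 1, 2).float()
    loss = (w_mse * F.mse_loss(probs, onehot)          # pull prediction toward the pseudo-GT
            + w_tv * total_variation(delta)            # spatial-coherence regularizer
            + w_iou * soft_iou_loss(probs, onehot))    # class-wise IoU term
    loss.backward()
    with torch.no_grad():
        delta = (delta - alpha * delta.grad.sign()).clamp(-eps, eps)  # project to the l_inf ball
    return delta
```

The refinement step described in the abstract would augment this objective with region-specific cross-entropy terms: one pushing vehicle pixels toward surrounding classes and one anchoring background pixels to their clean predictions.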
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 18822