Abstract: When high-level vision algorithms are applied to rainy scenes, it is customary to first preprocess the rainy images with a low-level rain removal network and then pass the result to a high-level vision network that performs the target task. Existing adversarial attack methods have never explored this setting; they are limited to attacking only one of the two kinds of network. Given the lack of multi-functional attack strategies and the importance of such systems for open-world perception, we are the first to propose a Cascaded Adversarial Attack (CAA) setting, in which a single adversarial example simultaneously attacks tasks at different levels, such as rain removal and semantic segmentation, within an integrated system. Specifically, our attack on the rain removal network aims to preserve rain streaks in the output image, while for the semantic segmentation network we employ powerful existing adversarial attack methods to induce misclassification of the image content. Importantly, CAA uses binary masks to concentrate these two markedly different perturbation distributions on the same input image, enabling attacks on both networks. Additionally, we propose two variants of CAA that reduce the differences between the two generated perturbations through a carefully designed perturbation interaction mechanism, resulting in enhanced attack performance. Extensive experiments validate the effectiveness of our methods, showing that they degrade the performance of the downstream task significantly more than methods that attack only a single network.
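The mask-based combination described in the abstract can be illustrated with a minimal, dependency-free sketch. The abstract does not specify how the binary masks are built or how the perturbation is bounded, so everything below is an assumption made for illustration: the function name `combine_perturbations`, the flat per-pixel lists, and the L-infinity budget `eps` are hypothetical and are not taken from the paper.

```python
def combine_perturbations(delta_derain, delta_seg, mask, eps):
    """Merge two perturbations with a binary mask, then clip to [-eps, eps].

    Assumption (not from the paper): where mask is 1 the rain-removal
    perturbation is kept, where mask is 0 the segmentation perturbation is
    kept, and the result is clipped to a hypothetical L-infinity budget eps.
    """
    if not (len(delta_derain) == len(delta_seg) == len(mask)):
        raise ValueError("all inputs must have the same length")
    combined = []
    for d_rain, d_seg, m in zip(delta_derain, delta_seg, mask):
        value = d_rain if m else d_seg               # mask selects the source
        combined.append(max(-eps, min(eps, value)))  # enforce the budget
    return combined


# Toy example on four "pixels" (values are made up for illustration).
delta_derain = [0.05, -0.02, 0.01, 0.00]
delta_seg = [-0.01, 0.04, -0.03, 0.02]
mask = [1, 0, 1, 0]
delta = combine_perturbations(delta_derain, delta_seg, mask, eps=0.03)
print(delta)
```

In this toy setting each pixel carries exactly one of the two perturbations, which is one plausible reading of how binary masks let both attacks coexist on a single input; the CAA variants with perturbation interaction are not modeled here.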
Primary Subject Area: [Content] Media Interpretation
Secondary Subject Area: [Content] Media Interpretation
Relevance To Conference: Systems that combine a low-level modality pre-processing network with a downstream network for a high-level visual task are widely deployed in the real world. For example, when high-level vision algorithms are applied to rainy scenes, real-world systems commonly use a low-level rain removal network as a pre-processing module to eliminate rain, followed by a downstream task network that achieves the task objective. The significant disparity between these two modalities makes attacking the integrated system a challenging problem. We present the first exploration of simultaneously attacking both low-level and high-level modality networks, revealing the vulnerability of such widely deployed systems in real-world scenarios. This finding calls for reconsidering the security of such multi-modal, multi-network systems.
Supplementary Material: zip
Submission Number: 4177