DepthCloak: Projecting Optical Camouflage Patches for Erroneous Monocular Depth Estimation of Vehicles
Abstract: Adhesive adversarial patches have been commonly used in attacks against the computer vision task of monocular depth estimation (MDE). Compared to physical patches permanently attached to target objects, optical projection patches offer greater flexibility and have gained wide research attention. However, directly projecting digital patches may lead to partial blurring or omission of details in the captured patches, owing to high information density, surface depth discrepancies, and non-uniform pixel distribution. To address these challenges, we introduce DepthCloak, an adversarial optical patch designed to interfere with the MDE of vehicles. We first simplify the patch to a gray pattern because the projected ``black-and-white light'' is strongly robust to ambient light. We propose a GAN-based approach to simulate projections and deduce a projectable list. Then, we employ neighborhood averaging to fill sparse depth values, compress all depth values into a reduced dynamic range via nonlinear mapping, and use these values as weight parameters to adjust the Gaussian blur radius, thereby simulating depth variation effects. Finally, by integrating a Moiré pattern and applying style transfer techniques, we customize adversarial patches with regularly arranged features. We deploy DepthCloak in real driving scenarios, and extensive experiments demonstrate that DepthCloak can achieve depth errors of over 9 meters in both bright and night-time conditions while achieving an attack success rate of over 80\% in the physical world.
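The depth-aware blur step described in the abstract (fill sparse depth, compress the range, use the result to scale the Gaussian blur) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the 8-neighbour window, the logarithmic compression curve, the `max_sigma` parameter, and all function names are assumptions chosen only for illustration.

```python
import math

def fill_sparse_depth(depth, missing=0.0):
    """Fill missing cells with the mean of their valid 8-neighbours (one pass)."""
    h, w = len(depth), len(depth[0])
    filled = [row[:] for row in depth]
    for y in range(h):
        for x in range(w):
            if depth[y][x] != missing:
                continue
            neigh = [depth[j][i]
                     for j in range(max(0, y - 1), min(h, y + 2))
                     for i in range(max(0, x - 1), min(w, x + 2))
                     if (j, i) != (y, x) and depth[j][i] != missing]
            if neigh:  # cells with no valid neighbour stay missing
                filled[y][x] = sum(neigh) / len(neigh)
    return filled

def compress_depth(depth):
    """Map depths to [0, 1] with a log curve (assumed nonlinear mapping)."""
    flat = [d for row in depth for d in row]
    lo, hi = min(flat), max(flat)
    span = math.log1p(hi - lo) or 1.0  # avoid division by zero on flat maps
    return [[math.log1p(d - lo) / span for d in row] for row in depth]

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [v / total for v in weights]

def depth_weighted_blur(image, weights, max_sigma=2.0):
    """Blur each pixel with a Gaussian whose sigma scales with its depth weight."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            sigma = weights[y][x] * max_sigma
            if sigma < 1e-3:
                continue  # weight near zero: leave the pixel sharp
            kernel = gaussian_kernel(sigma)
            r = len(kernel) // 2
            acc = 0.0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp at the borders
                    xx = min(max(x + dx, 0), w - 1)
                    acc += kernel[dy + r] * kernel[dx + r] * image[yy][xx]
            out[y][x] = acc
    return out

if __name__ == "__main__":
    sparse = [[2.0, 0.0, 4.0],
              [2.0, 2.0, 4.0],
              [0.0, 2.0, 4.0]]
    weights = compress_depth(fill_sparse_depth(sparse))
    patch = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
    print(depth_weighted_blur(patch, weights))
```

In this sketch, pixels at the smallest depth stay sharp while those at larger depths receive progressively wider kernels; a real pipeline would likely use an image library and a calibrated mapping from depth to blur.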
Primary Subject Area: [Experience] Multimedia Applications
Secondary Subject Area: [Content] Media Interpretation
Relevance To Conference: Our work bridges traditional multimedia operations, such as camera-projector systems, with multimodal data processing frameworks that combine the visual and spatial information perceived by vehicle sensors. This integration sits at the intersection of computer vision, optical engineering, and vehicular technology. DepthCloak not only exposes vulnerabilities in current vehicular perception systems but also motivates the development of more robust multimodal fusion algorithms for automotive security.
Furthermore, our research underscores the potential of multimedia and multimodal processing to tackle complex challenges in the domain of automotive visual security. This aligns seamlessly with the objective of the SIGMM community, which aims to advance research beyond traditional unimedia processing paradigms. By leveraging diverse disciplines and methodologies, our work contributes to the ongoing evolution of automotive security and highlights the importance of interdisciplinary collaboration in driving innovation within this field.
Supplementary Material: zip
Submission Number: 4173