NATURALADV: An Exploratory Framework to Balance Adversarial Strength and Stealth in Autonomous Driving Environments
Keywords: adversarial example, generation, testing, robustness
TL;DR: We provide a framework and study for the naturalness-strength trade-off in adversarial perturbation generation.
Abstract: Deep Neural Networks (DNNs) have become integral to various real-world autonomous mobile systems, from self-driving cars to food delivery robots. However, current adversarial attack techniques often focus on maximizing the attack strength at the cost of naturalness, leading to examples that are easily detected by humans or deviate significantly from the expected input distribution. This trade-off between adversarial effectiveness and natural appearance presents a critical challenge in ensuring the robustness and reliability of DNNs in practical settings.
In this paper, we introduce NATURALADV, a framework for navigating this trade-off.
Unlike traditional methods that prioritize pixel-level perturbations, our approach integrates a naturalness metric that reflects human perceptibility and the resemblance of adversarial examples to real-world inputs.
The framework combines pretrained neural networks, differentiable image similarity metrics, and custom loss functions to drive high-strength, gradient-based attacks toward adversarial images that balance these two competing objectives.
Initial empirical results demonstrate the framework’s potential to create adversarial examples that are both powerful and natural-looking, capable of bypassing DNN defenses while maintaining realism.
This work aims to offer software engineers a flexible approach to adversarial attack generation, with implications for robustness testing and model evaluation in real-world contexts.
It lifts robustness testing above the pixel level and enables future adversarial techniques that weigh the naturalness of generated tests alongside attack strength, paving the way for more resilient AI systems.
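To make the objective concrete, the following is a minimal sketch of such a naturalness-weighted attack loop in PyTorch; it is not the paper's implementation. LPIPS stands in for the differentiable similarity metric, and the function name `natural_adv`, the weight `lambda_nat`, and all hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch: jointly optimize attack strength and naturalness.
# Assumes `model` is a frozen pretrained classifier in eval mode, `x` is a
# batch of images scaled to [-1, 1] (the range LPIPS expects), and `y` holds
# the true labels. Not the authors' code; names and weights are illustrative.
import torch
import torch.nn.functional as F
import lpips  # differentiable perceptual similarity (Zhang et al.)

def natural_adv(model, x, y, steps=100, lr=0.01, lambda_nat=1.0):
    sim = lpips.LPIPS(net="alex").to(x.device)
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_adv = torch.clamp(x + delta, -1.0, 1.0)
        # Attack strength: maximize the classifier's loss on the true label.
        attack_loss = -F.cross_entropy(model(x_adv), y)
        # Naturalness: keep x_adv perceptually close to the original image.
        nat_loss = sim(x_adv, x).mean()
        loss = attack_loss + lambda_nat * nat_loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return torch.clamp(x + delta, -1.0, 1.0).detach()
```

Sweeping `lambda_nat` would trace out the strength-naturalness frontier the abstract describes: small values favor attack success, large values favor perceptual fidelity to the original input.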
Submission Number: 15