Abstract: The explainability of deep neural networks (DNNs) remains a significant hurdle in developing trustworthy AI, particularly in essential fields such as medical imaging. Despite progress in explainable AI (XAI), XAI methods remain susceptible to adversarial images, underscoring the urgent need for robustness evaluation. While many existing adversarial attack techniques target a specific explanation method, recent work has introduced black-box attacks capable of targeting multiple explanation approaches. However, such methods often require a large number of queries due to the high dimensionality of pixel-level modifications. In response, we propose an attack that constructs perturbations from semi-transparent, RGB-valued circles and optimizes the circles' parameters with an evolutionary strategy, drastically reducing the number of tunable optimization parameters. In experiments on medical image datasets, our method outperforms current leading techniques. This study further underscores the vulnerability of XAI methods in critical sectors such as medical imaging and advocates for more robust solutions.
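
As an illustration only, not the authors' implementation, the sketch below shows how a perturbation built from a few semi-transparent, RGB-valued circles could be optimized with a simple (mu, lambda) evolution strategy over the circle parameters alone. The image size, number of circles, parameter ranges, and the `attack_loss` objective are all assumptions; a real attack would replace `attack_loss` with black-box queries to the target model and explanation method.

```python
# Minimal sketch (assumptions throughout): circle-parameterized perturbations
# optimized by a simple (mu, lambda) evolution strategy. Only the circles'
# parameters are tuned, not individual pixels.

import numpy as np

H, W = 224, 224          # image size (assumption)
N_CIRCLES = 10           # circles per perturbation (assumption)
PARAMS_PER_CIRCLE = 7    # cx, cy, radius, R, G, B, alpha

def render(params, image):
    """Alpha-blend N semi-transparent circles onto a copy of `image` (HxWx3 in [0,1])."""
    out = image.copy()
    ys, xs = np.mgrid[0:H, 0:W]
    for c in params.reshape(N_CIRCLES, PARAMS_PER_CIRCLE):
        cx, cy, r = c[0] * W, c[1] * H, 3 + c[2] * 30          # center and radius
        rgb = np.clip(c[3:6], 0, 1)                            # circle color
        alpha = 0.1 + 0.4 * np.clip(c[6], 0, 1)                # semi-transparency
        mask = (xs - cx) ** 2 + (ys - cy) ** 2 <= r ** 2
        out[mask] = (1 - alpha) * out[mask] + alpha * rgb
    return np.clip(out, 0, 1)

def attack_loss(adv_image):
    """Placeholder objective: a real attack would query the target model and
    XAI method here and return a scalar to minimize."""
    return float(np.random.rand())  # stand-in only

def evolve(image, generations=50, mu=5, lam=20, sigma=0.1, seed=0):
    """Simple (mu, lambda) evolution strategy over the circle parameters only."""
    rng = np.random.default_rng(seed)
    dim = N_CIRCLES * PARAMS_PER_CIRCLE
    parents = rng.random((mu, dim))
    for _ in range(generations):
        offspring = np.concatenate([
            np.clip(parents[rng.integers(mu)] + sigma * rng.standard_normal(dim), 0, 1)[None]
            for _ in range(lam)
        ])
        scores = np.array([attack_loss(render(p, image)) for p in offspring])
        parents = offspring[np.argsort(scores)[:mu]]  # keep the mu best offspring
    best = parents[0]
    return render(best, image), best

if __name__ == "__main__":
    clean = np.random.rand(H, W, 3)                 # stand-in for a medical image
    adv, best_params = evolve(clean, generations=5)
    print("optimized parameters:", best_params.size)  # far fewer than H * W * 3 pixels
```

The point of the parameterization is visible in the last line: the search space has N_CIRCLES x 7 variables rather than one per pixel, which is what allows a query-limited black-box optimizer to make progress.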
Submission Length: Long submission (more than 12 pages of main content)
Assigned Action Editor: ~Yingzhen_Li1
Submission Number: 5281