Abstract: Neural networks are increasingly prevalent in day-to-day life, including in safety-critical applications such as self-driving cars and medical diagnosis. This prevalence has spurred extensive research into testing the robustness of neural networks against adversarial attacks, most commonly by determining whether misclassified inputs exist within a region around a correctly classified input. While most prior work analyzes robustness around a single input at a time, in this paper we consider the simultaneous analysis of multiple robustness regions. Our approach finds robustness-violating inputs away from expected decision boundaries, identifies varied types of misclassifications by increasing confusion matrix coverage, and effectively discovers robustness-violating inputs that satisfy input feasibility constraints. We demonstrate the capabilities of our approach on multiple networks trained on several datasets, including ImageNet and a street sign identification dataset.