Abstract: The robustness of a neural network against adversarial examples is essential when a deep classifier is applied in safety-critical use cases such as health care or autonomous driving. To assess this robustness, practitioners use a variety of tools, ranging from adversarial attacks to the exact computation of the distance to the decision boundary. We exploit the fact that the robustness of a neural network is a local property and empirically show that computing the same metrics on smaller, local substitute networks yields reasonable robustness estimates at a lower cost. To construct the substitute network, we develop several pruning techniques that preserve the local properties of the initial network around a given anchor point. Our experiments on multiple datasets demonstrate that this approach saves a significant amount of computation.
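The abstract does not spell out the pruning criterion, so the following is only a minimal sketch of one plausible way to build a local substitute network: activation-aware magnitude pruning around the anchor point. The function `local_prune`, its scoring rule, and the `keep_ratio` parameter are illustrative assumptions, not the authors' actual method.

```python
# Hypothetical sketch: prune weights that contribute little at a given
# anchor point, so the substitute network behaves like the original one
# locally. NOT the paper's algorithm; the scoring heuristic is assumed.
import torch
import torch.nn as nn


def local_prune(model: nn.Sequential, anchor: torch.Tensor,
                keep_ratio: float = 0.5) -> nn.Sequential:
    """Zero out weights with small local importance at the anchor.

    For each linear layer, weight w_ij is scored by |w_ij * a_j|, where
    a_j is the layer's input activation at the anchor; only the top
    `keep_ratio` fraction of weights is kept.
    """
    pruned = []
    a = anchor.detach()
    for layer in model:
        if isinstance(layer, nn.Linear):
            new = nn.Linear(layer.in_features, layer.out_features)
            with torch.no_grad():
                # Local importance: weight magnitude times anchor activation.
                score = (layer.weight * a).abs()
                k = max(1, int(keep_ratio * score.numel()))
                threshold = score.flatten().kthvalue(score.numel() - k + 1).values
                mask = (score >= threshold).float()
                new.weight.copy_(layer.weight * mask)
                new.bias.copy_(layer.bias)
            pruned.append(new)
        else:
            pruned.append(layer)
        # Propagate the anchor through the ORIGINAL layer to score the next one.
        a = layer(a)
    return nn.Sequential(*pruned)


# Usage: prune a small MLP around one input; robustness metrics (attacks,
# decision-boundary distances) would then be run on the cheaper substitute.
torch.manual_seed(0)
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
x0 = torch.randn(8)
sub = local_prune(net, x0, keep_ratio=0.5)
print(net(x0), sub(x0))  # outputs should roughly agree near the anchor
```

Because the score depends on the anchor's activations, the mask is specific to the neighborhood of that input; any global pruning criterion would not, by itself, preserve the local decision boundary the way the paper requires.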
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Meisam_Razaviyayn1
Submission Number: 4101