TL;DR: We show that the popular average certified radius metric is improper for evaluating randomized smoothing, and propose new metrics to replace it.
Abstract: Randomized smoothing (RS) is popular for providing certified robustness guarantees against adversarial attacks. The average certified radius (ACR) has emerged as a widely used metric for tracking progress in RS. In this work, however, we show for the first time that ACR is a poor metric for evaluating the robustness guarantees provided by RS. We theoretically prove not only that a trivial classifier can have an arbitrarily large ACR, but also that ACR is extremely sensitive to improvements on easy samples. In addition, comparisons based on ACR depend strongly on the certification budget. Empirically, we confirm that existing training strategies, though improving ACR, consistently reduce the model's robustness on hard samples. To strengthen our findings, we propose strategies, including explicitly discarding hard samples, reweighting the dataset with an approximate certified radius, and extreme optimization for easy samples, to replicate the progress in RS training and even achieve state-of-the-art ACR on CIFAR-10, without training for robustness on the full data distribution. Overall, our results suggest that ACR has introduced a strong undesired bias into the field, and its use in RS should be discontinued. Finally, we suggest using the empirical distribution of $p_A$, the accuracy of the base model on noisy data, as an alternative metric for RS.
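To make the abstract's argument concrete, here is a minimal sketch of how ACR is typically computed in the standard randomized-smoothing setup (the certified $\ell_2$ radius $R = \sigma \, \Phi^{-1}(p_A)$ from Cohen et al., with radius 0 for misclassified or abstained samples). The specific values and function names below are illustrative assumptions, not taken from the paper; the toy example only illustrates why a single easy sample with $p_A \approx 1$ can dominate the average.

```python
import numpy as np
from scipy.stats import norm


def certified_radius(p_a: float, sigma: float) -> float:
    """Certified L2 radius of a smoothed classifier, given p_A (the accuracy
    of the base model on noisy copies of the input) and noise level sigma."""
    if p_a <= 0.5:
        return 0.0  # not certified / abstain
    return sigma * norm.ppf(p_a)


def average_certified_radius(p_a_values, correct, sigma: float) -> float:
    """ACR over a test set: incorrect samples contribute radius 0, so a few
    easy samples with p_A close to 1 can dominate (Phi^{-1}(p_A) grows
    without bound as p_A -> 1)."""
    radii = [certified_radius(p, sigma) if c else 0.0
             for p, c in zip(p_a_values, correct)]
    return float(np.mean(radii))


# Hypothetical example: one near-perfect easy sample outweighs
# several barely-certified hard samples.
p_a_values = [0.9999, 0.55, 0.55, 0.55]
correct = [True, True, True, True]
print(average_certified_radius(p_a_values, correct, sigma=0.5))
```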
Lay Summary: Artificial intelligence (AI) is becoming increasingly common in many areas of our lives, making it important to ensure these AI systems are trustworthy and reliable. Randomized smoothing is one of the most popular methods used to make AI models more trustworthy. However, this paper identifies an important and widespread problem in the evaluation of randomized smoothing.
We find that one widely used evaluation metric allows the neural network to focus only on easy data and ignore hard data. This could lead to critical problems such as unfairness. We show this problem both theoretically and empirically, with extensive evidence demonstrating that it has introduced a strong selection bias into the development of these algorithms.
To address this issue, we suggest alternative metrics to replace the current one. This will make the evaluation of future developments in trustworthy AI more reliable.
Link To Code: https://github.com/eth-sri/acr-weakness
Primary Area: Deep Learning->Robustness
Keywords: randomized smoothing, metric, average certified radius
Submission Number: 7092