Parameterizing Activation Functions for Adversarial Robustness

Sihui Dai, Saeed Mahloujifar, Prateek Mittal

Published: 01 Jan 2022, Last Modified: 21 Feb 2024SP (Workshops) 2022Readers: Everyone

Abstract: Deep neural networks are known to be vulnerable to adversarially perturbed inputs. A commonly used defense is adversarial training, whose performance is influenced by model architecture. While previous works have studied the impact of varying model width and depth on robustness, the impact of using learnable parametric activation functions (PAFs) has not been studied. We study how using learnable PAFs can improve robustness in conjunction with adversarial training. We first ask the question: Can changing activation function shape improve robustness? To address this, we choose a set of PAFs with parameters that allow us to independently control behavior on negative inputs, inputs near zero, and positive inputs. Using these PAFs, we train models using adversarial training with fixed PAF shape parameter values. We find that all regions of PAF shape influence the robustness of obtained models, however only variation in certain regions (inputs near zero, positive inputs) can improve robustness over ReLU. We then combine learnable PAFs with adversarial training and analyze robust performance. We find that choice of activation function can significantly impact the robustness of the trained model. We find that only certain PAFs, such as smooth PAFs, are able to improve robustness significantly over ReLU. Overall, our work puts into context the importance of activation functions in adversarially trained models.

0 Replies