Smoothed-SGDmax: A Stability-Inspired Algorithm to Improve Adversarial Generalization

Jiancong Xiao; Jiawei Zhang; Zhi-Quan Luo; Asuman E. Ozdaglar

Smoothed-SGDmax: A Stability-Inspired Algorithm to Improve Adversarial Generalization

Jiancong Xiao, Jiawei Zhang, Zhi-Quan Luo, Asuman E. Ozdaglar

Published: 01 Feb 2023, Last Modified: 13 Feb 2023Submitted to ICLR 2023Readers: Everyone

Keywords: Adversarial Training, Robust Overfitting, Generalization Bound

Abstract: Unlike standard training, deep neural networks can suffer from serious overfitting problems in adversarial settings, which is studied extensively by empirical papers. Recent research (e.g., Xing et al. (2021); Xiao et al. (2022)) show that SGDmax-based adversarial training algorithms with $1/s(T)$ training loss incurs a stability-based generalization bound in $\Theta(c+s(T)/n)$. Here $T$ is the number of iterations, $n$ is the number of samples, $s(T)\rightarrow \infty$ as $T\rightarrow \infty$, and $c$ is a $n$-independent term. This reveals that adversarial training can have nonvanishing generalization errors even if the sample size $n$ goes to infinity. A natural question arises: can we eliminate the nonvanishing term $c$ by designing a more generalizable algorithm? We give an affirmative answer in this paper. First, by an adaptation of information-theoretical lower bound on the complexity of solving Lipschitz-convex problems using randomized algorithms, we show that a minimax lower bound for adversarial generalization gap is $\Omega(s(T)/n)$ given training loss $1/s(T)$. This implies that SGDmax does not achieve the lower bound. Next, by observing that the nonvanishing generalization error term for SGDmax comes from the non-smoothness of the adversarial loss function, we employ a smoothing technique to smooth the adversarial loss function. Based on the smoothed loss function, we design a smoothed SGDmax algorithm achieving generalization bound $\mathcal{O}(s(T)/n)$, which matches the minimax lower bound. Experimentally, we show that our algorithm improves adversarial generalization on common datasets.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Submission Guidelines: Yes

Please Choose The Closest Area That Your Submission Falls Into: Theory (eg, control theory, learning theory, algorithmic game theory)

15 Replies

Loading