Abstract: Understanding the adversarial robustness of Vision Transformers (ViTs) is important because the vulnerability of neural networks to adversarial perturbations hinders their deployment. We present an approach that decomposes the network into submodules and computes the maximal singular value of each submodule with respect to its input, which serves as a good indicator of adversarial robustness. To investigate whether Multi-head Self-Attention (MSA) in ViTs contributes to adversarial robustness, we use this decomposition to replace the MSA modules with convolutional layers and conclude that MSA has limited power to defend against adversarial attacks.
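The abstract does not spell out how the per-module maximal singular value is computed. A common way to estimate it, interpreting it as the spectral norm of the submodule's input-output Jacobian at a given input, is power iteration with Jacobian-vector and vector-Jacobian products. The sketch below is only a minimal illustration under that assumption; the function `max_singular_value`, the iteration count, and the use of PyTorch's `torch.autograd.functional` utilities are assumptions, not the paper's released code.

```python
import torch
from torch.autograd.functional import jvp, vjp

def max_singular_value(module, x, n_iters=20):
    """Estimate the largest singular value of the Jacobian of `module`
    at input `x` via power iteration on J^T J (illustrative sketch)."""
    module.eval()
    f = lambda inp: module(inp)

    v = torch.randn_like(x)          # random probe direction
    v = v / v.norm()
    for _ in range(n_iters):
        _, u = jvp(f, x, v)          # u = J v   (Jacobian-vector product)
        _, w = vjp(f, x, u)          # w = J^T u (vector-Jacobian product)
        v = w / (w.norm() + 1e-12)   # renormalize the probe direction

    _, u = jvp(f, x, v)
    return u.norm().item()           # sigma_max ≈ ||J v|| since ||v|| = 1
```

Applied submodule by submodule (e.g., to each attention block or to a convolutional replacement), such an estimate gives the per-module sensitivity that the abstract refers to as an indicator of adversarial robustness.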