Abstract: We introduce a new perspective on adversarial vulnerability in image classification: fragility can arise from poor convergence in off-manifold directions. We model data as lying on low-dimensional manifolds, where on-manifold directions correspond to high-variance, data-aligned features and off-manifold directions capture low-variance, nuanced features. This geometry makes the optimization landscape inherently ill-conditioned for standard first-order optimizers such as gradient descent, leading to slow or incomplete convergence in off-manifold directions. When the data is inseparable along on-manifold directions, robustness depends on learning these subtle off-manifold features, and failure to converge leaves models exposed to adversarial perturbations.
On the theoretical side, we formalize this mechanism through convergence analyses of logistic regression and two-layer linear networks under first-order methods. These results highlight how ill-conditioning slows or prevents convergence in off-manifold directions, thereby motivating the use of second-order methods, which mitigate ill-conditioning and achieve convergence across all directions. Empirically, we demonstrate that even without adversarial training, robustness improves significantly with extended training or second-order optimization, underscoring convergence as a central factor.
As an auxiliary empirical finding, we observe that batch normalization suppresses these robustness gains, consistent with its implicit bias toward uniform-margin rather than max-margin solutions.
By introducing the notions of on- and off-manifold convergence, this work provides a novel theoretical explanation for adversarial vulnerability.
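To make the mechanism concrete, the following minimal sketch (our illustration, not code from the paper or its repository) compares gradient descent with Newton's method on a two-dimensional logistic regression where the label-relevant signal lies in a low-variance, off-manifold coordinate; all constants are illustrative:

```python
import numpy as np

# Toy setup (illustrative only): a high-variance "on-manifold" coordinate
# and a low-variance "off-manifold" coordinate that carries the label signal.
rng = np.random.default_rng(0)
n = 2000
x_on = rng.normal(scale=5.0, size=n)    # on-manifold: high variance
x_off = rng.normal(scale=0.05, size=n)  # off-manifold: low variance
X = np.column_stack([x_on, x_off])
y = (x_off > 0).astype(float)           # labels depend only on the subtle feature

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))

lam = 1e-3  # small ridge term, so the optimum is finite

def grad_hess(w):
    p = sigmoid(X @ w)
    g = X.T @ (p - y) / n + lam * w
    H = (X * (p * (1.0 - p))[:, None]).T @ X / n + lam * np.eye(2)
    return g, H

w_gd, w_nt = np.zeros(2), np.zeros(2)
lr = 0.15  # roughly 1/L, where L is dictated by the high-variance direction
for _ in range(500):
    g, _ = grad_hess(w_gd)
    w_gd -= lr * g                    # first-order: one global step size
    g, H = grad_hess(w_nt)
    w_nt -= np.linalg.solve(H, g)     # second-order: curvature-rescaled step

# GD's off-manifold weight is still far from the optimum after 500 steps,
# while Newton converges in a handful of iterations.
print(f"off-manifold weight -- GD: {w_gd[1]:.2f}, Newton: {w_nt[1]:.2f}")
```

Because the loss curvature along the off-manifold coordinate is tiny, gradient descent's single global step size yields negligible per-step progress in that direction, whereas Newton's curvature-rescaled step converges quickly.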
Submission Type: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Below we summarize the changes in the final revision:
- We have addressed minor typos pointed out by *Reviewer j4Br*; for instance, the equation numbering reference under Theorem 4.1 has been fixed.
- We have expanded the **related work** section to include new references suggested by *Reviewers kcM2 and kj1y*.
- We have added a new simulation section (6.1) and Figure 3 to showcase the convergence behavior as the dimensionality varies, in both the manifold-overlap and no-overlap settings, for ADAM (first-order) and KFAC (second-order) optimization, measured by the number of steps and wall-clock time required to attain robustness or the optimal decision boundary. This further strengthens the link between our theory and the experimental results, and addresses the discussion (**Computational Complexity and Timing Analysis**) we had with *Reviewer kcM2*.
- We revised the writing of Section 5 for clarity, to better distinguish the curvature-dependent adaptive step sizes offered by second-order methods, which resolve ill-conditioning, from variable step-size first-order methods such as ADAM and ADAGRAD, which still face ill-conditioning because their effective step size remains bounded by a global Lipschitz constant (see the sketch after this list). This pertains to the discussion we had with reviewer *kj1y* on **Misconception Regarding Adam vs. Second-Order Conditioning**.
- Finally, we slightly rephrased the statements of Theorems 4.2 and 4.3 to address the confusion (**Clarification on Theorem 4.2**) that we discussed with reviewer *kj1y*.
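To make the Section 5 distinction above concrete, here is a minimal sketch (our illustration, not an experiment from the paper): on a quadratic whose ill-conditioned curvature is not axis-aligned, neither plain gradient descent nor Adam resolves the low-curvature direction within a fixed step budget, since their effective steps are bounded by a single global constant, while one Newton step, rescaled by the full curvature, converges.

```python
import numpy as np

# Toy quadratic (illustrative only): ill-conditioned curvature that is NOT
# axis-aligned, so Adam's diagonal per-coordinate rescaling cannot undo it.
theta = np.pi / 4
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
A = R @ np.diag([100.0, 0.01]) @ R.T  # eigenvalues: 100 (steep), 0.01 (flat)
w0 = R @ np.array([1.0, 1.0])         # error in both eigen-directions

def flat_error(w):
    # Remaining error along the low-curvature eigenvector.
    return abs(R[:, 1] @ w)

# Plain gradient descent: the step size is capped by the largest eigenvalue.
w = w0.copy()
for _ in range(500):
    w -= 0.01 * (A @ w)
print(f"GD    : {flat_error(w):.3f}")

# Adam: adaptive per-coordinate steps, but still bounded by the global lr.
w, m, v = w0.copy(), np.zeros(2), np.zeros(2)
lr, b1, b2, eps = 0.05, 0.9, 0.999, 1e-8
for t in range(1, 501):
    g = A @ w
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g ** 2
    w -= lr * (m / (1 - b1 ** t)) / (np.sqrt(v / (1 - b2 ** t)) + eps)
print(f"Adam  : {flat_error(w):.3f}")

# Newton: rescales each direction by its own curvature; one step suffices.
w = w0 - np.linalg.solve(A, A @ w0)
print(f"Newton: {flat_error(w):.2e}")
```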
Code: https://github.com/rhaldarpurdue/Adversarial_Vulnerability_Convergence_code
Assigned Action Editor: ~Olivier_Cappé2
Submission Number: 5845