# A Unified Stability Analysis of SAM vs SGD: Role of Data Coherence and Emergence of Simplicity Bias

## Abstract
Understanding the dynamics of optimization algorithms in deep learning has become increasingly critical, especially as models grow in scale and complexity. Despite the empirical success of stochastic gradient descent (SGD) and its variants in finding solutions that generalize well, the precise mechanisms underlying this generalization remain poorly understood. A particularly intriguing aspect of this phenomenon is the bias of optimization algorithms towards certain types of minima—often flatter or simpler—especially in overparameterized regimes. While prior works have associated flatness of the loss landscape with better generalization, tools to mechanistically connect data, optimization algorithms, and the nature of the resulting minima are still limited. For instance, methods like Sharpness-Aware Minimization (SAM) have shown practical gains by explicitly promoting flatness, but lack a unified theoretical framework explaining their influence across different data structures and model architectures. In this work, we introduce a comprehensive linear stability analysis framework to dissect the behavior of optimization algorithms—SGD, random perturbations, and SAM—in neural networks, focusing particularly on two-layer ReLU models. Our approach is built upon a novel coherence measure that captures the interaction between data geometry and gradient similarity, providing new insights into why and how certain solutions are favored.
## Usage
To run the experiments from the local quadratic loss section:
```
python job_submit.py
```
We use Slurm internally to manage jobs. You can also switch to another job launcher to reproduce the experiments.
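
If Slurm is not available, a small local launcher can stand in for it. The following is only a minimal sketch, not part of this repository: `run_locally` is a hypothetical helper, and the `--seed` flag in the example job list is a placeholder, so check `run_exp.py` for the arguments it actually accepts.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_locally(commands, max_parallel=2):
    """Run each command (a list of argv strings); return exit codes in input order."""

    def _run(cmd):
        # check=False: report the exit code instead of raising on failure.
        return subprocess.run(cmd, check=False).returncode

    # Threads are enough here: each worker only waits on a child process.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(_run, commands))


if __name__ == "__main__":
    # Hypothetical job list: the script name comes from this repository, but
    # the `--seed` flag is a placeholder for whatever run_exp.py really takes.
    jobs = [[sys.executable, "run_exp.py", "--seed", str(seed)] for seed in range(3)]
    print(run_locally(jobs))
```

Set `max_parallel` according to the number of CPU cores or GPUs on your machine. A nonzero exit code in the returned list marks a job that failed.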

To run the experiments on neural network training:
```
python job_submit_nn.py
```
You need to change the settings in the file to fit your machine. Different experiments are selected through command-line arguments; see the argument definitions in `nn_training.py` and `run_exp.py` for details.
