# Code

## grad_indep.py

grad_indep.py produces Figure 2(b) in our paper. It records the maximum absolute value of the inner product between $\nabla\beta_{i}(\theta_{t})$ and $\nabla\beta_{j}(\theta_{t})$ every $10$ iterations over the first $100$ iterations. We use a 3-block CNN and the MNIST dataset.
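
The tracked quantity can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in grad_indep.py: it assumes the gradients $\nabla\beta_{i}(\theta_{t})$ have already been flattened and stacked as the rows of a matrix `grads` (a hypothetical name), and that the maximum is taken over pairs $i \neq j$.

```python
import numpy as np

def max_abs_inner_product(grads: np.ndarray) -> float:
    """Largest |<g_i, g_j>| over all pairs i != j of rows of `grads`.

    `grads` is assumed to have shape (num_coefficients, num_parameters)
    with at least two rows.
    """
    gram = grads @ grads.T                        # gram[i, j] = <g_i, g_j>
    off_diag = ~np.eye(len(grads), dtype=bool)    # mask out the i == j entries
    return float(np.abs(gram[off_diag]).max())

# Toy stand-in for three flattened gradient vectors (not real model output).
grads = np.array([[1.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0],
                  [0.5, 0.0, 1.0]])
print(max_abs_inner_product(grads))
```

A value close to zero means the gradients of different coefficients are nearly orthogonal at that iteration.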

## beta_domin.py

beta_domin.py produces Figure 2(c) in our paper. It records the relationship, on a log scale, between $|\beta_{i}(\theta_{t})|$ and $\|\nabla\beta_{i}(\theta_{t})\|$ over the first $400$ iterations. We use a 3-block CNN and the MNIST dataset.
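
One way to examine such a relationship is to take logarithms of both quantities and fit a straight line. The sketch below is an assumption about how the comparison could be done, not the procedure in beta_domin.py; `beta_values` and `grad_norms` are hypothetical arrays holding $|\beta_{i}(\theta_{t})|$ and $\|\nabla\beta_{i}(\theta_{t})\|$ collected during training, and both are assumed to be strictly positive.

```python
import numpy as np

def log_log_fit(beta_values, grad_norms):
    """Fit log(grad_norm) = slope * log(|beta|) + intercept by least squares."""
    log_beta = np.log(np.abs(np.asarray(beta_values, dtype=float)))
    log_grad = np.log(np.asarray(grad_norms, dtype=float))
    slope, intercept = np.polyfit(log_beta, log_grad, deg=1)
    return float(slope), float(intercept)

# Synthetic data following grad_norm = 2 * |beta| ** 0.5 (illustrative only).
beta_values = np.array([1e-3, 1e-2, 1e-1, 1.0])
grad_norms = 2.0 * np.sqrt(np.abs(beta_values))
slope, intercept = log_log_fit(beta_values, grad_norms)
print(slope, np.exp(intercept))
```

On this synthetic data the fitted slope recovers the exponent of the power law and `exp(intercept)` recovers its prefactor, up to floating-point error.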

## small-init.py

small-init.py produces Figure 2(a) in our paper. It computes the initial $|\beta_{i}(\theta_{0})|$, indexed by the coefficients of the final A-CK.

## train.py

train.py returns the basis coefficients, training loss, and training/test accuracy for the given model, dataset, and model parameters.

Example:

    python3 train.py --call_wandb --model='alexnet' --init_scale=1 --data='Cifar10' --loss_fn='mse_loss' --optimizer='lars' --batch_size=512 --num_epoch=300 --lr_setting 2 0 1e-2 --decay_rate=0.33 --decay_stepsize=50 --seed=0 --num_run=0 --no-gradient 
