# Revisiting the Relation Between Robustness and Universality

This is the initial release of the code for the paper "Revisiting the Relation Between Robustness and Universality".

## Setup

First, create a conda environment and install the dependencies:
```shell
conda create -n univ python=3.8 pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia --yes
conda activate univ
pip install -r requirements.txt
pip install -e .
```

Then download the external data (~200 GB), i.e., the models and inverted images, as described below.

## Data

To reproduce all paper results, the following data is required.
It is distributed across four Zenodo datasets:

| Content | Link                                                                               |
|---------|------------------------------------------------------------------------------------|
| TBD     | [https://doi.org/10.5281/zenodo.11244154](https://doi.org/10.5281/zenodo.11244154) |
| TBD     | [https://doi.org/10.5281/zenodo.11244274](https://doi.org/10.5281/zenodo.11244274) |
| TBD     | [https://doi.org/10.5281/zenodo.11244308](https://doi.org/10.5281/zenodo.11244308) |
| TBD     | [https://doi.org/10.5281/zenodo.11244316](https://doi.org/10.5281/zenodo.11244316) |

### Models

Checkpoints for the pre-trained **l2-robust** CNNs by Salman et al. are available
[here](https://huggingface.co/madrylab/robust-imagenet-models).
Checkpoints for the other models can be downloaded from Zenodo.

### Datasets

**ImageNet:** ImageNet-1k, including the ILSVRC2012 validation set used for evaluation, can be downloaded from the
[ImageNet website](https://www.image-net.org/).

**CIFAR-10:** The dataset can be downloaded
[here](https://www.cs.toronto.edu/%7Ekriz/cifar.html).

**SAT-6:** The SAT-6 test set is used to evaluate the similarity of models on out-of-distribution data. It can be
downloaded from [Kaggle](https://www.kaggle.com/datasets/crawford/deepsat-sat6).


## Model Training

### ImageNet

To train a model on ImageNet, first convert the ImageNet data into LMDB files using the `folder2lmdb.py` script in the
`utils` folder:

```shell
# Creates train.lmdb file
python folder2lmdb.py -f PATH_TO_IMAGENET/ILSVRC -s "train"

# Creates val.lmdb file
python folder2lmdb.py -f PATH_TO_IMAGENET/ILSVRC -s "val"
```

Subsets can be created with `lmdb_creation.ipynb`.

To start the training, run

```shell
# Train standard ImageNet model, e.g. TinyViT-5m
python imagenet_training.py -s ./data/cnns/imagenet/eps0/ -m tiny_vit_5m

# Train standard ImageNet100 model, e.g. ResNet-18
python imagenet_training.py -s ./data/cnns/imagenet100/eps0/ -m resnet18 -n 100 -d PATH_TO_DIR_WITH_LMDB_FILE

# Train robust TinyViT-5m
python imagenet_training.py -s ./data/cnns/imagenet/eps3/ -m tiny_vit_5m -a 1
```

After training is finished, two checkpoint files `checkpoint.pt.latest` and `checkpoint.pt.best` will be available in
the specified save directory. To evaluate the pre-trained models on the validation set, run

```shell
# Evaluate standard TinyViT-5m
python imagenet_training.py -t 0 -v ./data/cnns/imagenet/eps0/tiny_vit_5m.ckpt

# Evaluate robust TinyViT-5m
python imagenet_training.py -t 0 -v ./data/cnns/imagenet/eps3/tiny_vit_5m.ckpt
```

### CIFAR-10

To train a CIFAR-10 model from scratch, run

```shell
# Train standard ResNet-18
python cifar10_training.py -m resnet18 -s ./data/cnns/cifar10/eps0/

# Train robust ResNet-18
python cifar10_training.py -m resnet18 -s ./data/cnns/cifar10/eps1/ -a 1
```

After training is finished, two checkpoint files `checkpoint.pt.latest` and `checkpoint.pt.best` will be available in
the specified save directory. To evaluate the pre-trained models on the test set, run

```shell
# Evaluate robust ResNet-18
python cifar10_training.py -t 0 -m resnet18 -v ./data/cnns/cifar10/eps1/resnet18.pt

# Evaluate standard ResNet-18
python cifar10_training.py -t 0 -m resnet18 -v ./data/cnns/cifar10/eps0/resnet18.pt
```
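
Both training scripts are described above as writing `checkpoint.pt.latest` and `checkpoint.pt.best` into the save directory. Before starting an evaluation or similarity run, it can help to confirm that both files are present. The following is a minimal sketch that uses only the Python standard library; the helper name `missing_checkpoints` is hypothetical and not part of this repository.

```python
from pathlib import Path
from typing import List

# File names taken from the training description above.
CHECKPOINT_NAMES = ("checkpoint.pt.latest", "checkpoint.pt.best")


def missing_checkpoints(save_dir: str) -> List[str]:
    """Return the expected checkpoint file names that are absent from save_dir."""
    root = Path(save_dir)
    return [name for name in CHECKPOINT_NAMES if not (root / name).is_file()]


# Example: check the save directory used for the standard CIFAR-10 ResNet-18.
missing = missing_checkpoints("./data/cnns/cifar10/eps0/")
print("missing checkpoints:", missing if missing else "none")
```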

## Generating Inverted Images

### Inverted Images

To generate inverted images for ImageNet or CIFAR-10 CNNs, run

```shell
# Generate inverted images for eps3 robust ResNet-50 trained on ImageNet
python inversion.py -d imagenet -e eps3 -i ./data/imagenet/inverted/ -m resnet50 -p imagenet -s ./results/cka/inverted/10000/imagenet/seed_indices_0.csv -t ./results/cka/inverted/10000/imagenet/target_indices_0.csv

# Generate inverted images for eps0 ResNet-50 trained on CIFAR-10
python inversion.py -d cifar10 -e eps0 -i ./data/cifar10/inverted/ -m resnet50 -p cifar10 -s ./results/cka/inverted/10000/cifar10/seed_indices_0.csv

# Generate inverted images for eps3 robust ResNet-50 trained on ImageNet using SAT-6
python inversion.py -d sat6 -e eps3 -i ./data/sat6/inverted/imagenet/ -m resnet50 -p imagenet -s ./results/cka/inverted/sat6/seed_indices_0.csv -t ./results/cka/inverted/10000/sat6/target_indices_0.csv
```
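
The commands above take seed and target indices. This suggests the usual representation-inversion recipe: start from a seed image and optimize it until its representation matches that of a target image. The sketch below illustrates the idea on a toy problem and is not the code in `inversion.py`. It assumes a linear feature map in place of a network, plain gradient descent, and an unnormalized squared-error loss; all names are hypothetical.

```python
import random
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def features(weights: Matrix, image: Vector) -> Vector:
    """Toy stand-in for a network representation: the linear map W @ x."""
    return [sum(w * x for w, x in zip(row, image)) for row in weights]


def inversion_loss(weights: Matrix, image: Vector, target_rep: Vector) -> float:
    """Squared distance between the image's representation and the target's."""
    return sum((f - t) ** 2 for f, t in zip(features(weights, image), target_rep))


def invert(weights: Matrix, seed_image: Vector, target_image: Vector,
           steps: int = 200) -> Vector:
    """Move the seed by gradient descent so its representation matches the target's."""
    target_rep = features(weights, target_image)
    # A step of 1 / (2 * ||W||_F^2) is small enough that the loss never increases.
    step = 1.0 / (2.0 * sum(w * w for row in weights for w in row))
    image = list(seed_image)
    for _ in range(steps):
        residual = [f - t for f, t in zip(features(weights, image), target_rep)]
        # Gradient of ||W x - r||^2 with respect to x is 2 * W^T (W x - r).
        grad = [2.0 * sum(weights[k][j] * residual[k] for k in range(len(weights)))
                for j in range(len(image))]
        image = [x - step * g for x, g in zip(image, grad)]
    return image


# Assumption: random toy data; real inputs would be images and a trained model.
rng = random.Random(0)
weights = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(4)]
seed_image = [rng.random() for _ in range(8)]
target_image = [rng.random() for _ in range(8)]

target_rep = features(weights, target_image)
inverted = invert(weights, seed_image, target_image)
print("loss before:", inversion_loss(weights, seed_image, target_rep))
print("loss after: ", inversion_loss(weights, inverted, target_rep))
```

In this toy setting the inverted image stays close to the seed in input space while adopting the target's features. A real implementation would typically also clip pixel values to the valid image range.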

## Model Similarity

Computing model similarity requires the pre-trained models and the datasets described above. Evaluating model feature
usage additionally requires inverted images or adversarial examples. To calculate representational similarity, run

```shell
# Calculate representational similarity on ImageNet
python rep.py --dataset imagenet

# Calculate representational similarity on ImageNet using inverted images
python rep.py --dataset imagenet --inv 1

# Calculate representational similarity on ImageNet using adversarial examples
python rep.py --dataset imagenet --adv 1

# Calculate representational similarity on CIFAR-10
python rep.py --dataset cifar10 --models cifar10

# Calculate representational similarity on SAT-6
python rep.py --dataset sat6 --models imagenet
```
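
The result paths in this repository mention CKA, so one of the representational similarity measures is presumably centered kernel alignment. As a reference point, here is a minimal pure-Python sketch of linear CKA between two activation matrices, with one row per input and one column per feature. It is an illustration under that assumption, not the implementation in `rep.py`, and the function names are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def center_columns(x: Matrix) -> Matrix:
    """Subtract the per-feature mean so every column has zero mean."""
    n, d = len(x), len(x[0])
    means = [sum(row[j] for row in x) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in x]


def cross_sq_norm(x: Matrix, y: Matrix) -> float:
    """Squared Frobenius norm of X^T Y for row-aligned matrices X and Y."""
    total = 0.0
    for i in range(len(x[0])):
        for j in range(len(y[0])):
            s = sum(x[k][i] * y[k][j] for k in range(len(x)))
            total += s * s
    return total


def linear_cka(x: Matrix, y: Matrix) -> float:
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F) on centered data."""
    if len(x) != len(y):
        raise ValueError("both matrices need one row per input example")
    xc, yc = center_columns(x), center_columns(y)
    denom = math.sqrt(cross_sq_norm(xc, xc) * cross_sq_norm(yc, yc))
    if denom == 0.0:
        raise ValueError("representations must not be constant across inputs")
    return cross_sq_norm(xc, yc) / denom


# Toy activations for four inputs; real activations would come from a model layer.
acts_a = [[1.0, 2.0], [3.0, 4.0], [5.0, 7.0], [0.0, 1.0]]
# The same activations rotated by 90 degrees and scaled by 3.
acts_b = [[-3.0 * b, 3.0 * a] for a, b in acts_a]
print(linear_cka(acts_a, acts_b))
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of the features, so the rotated and scaled copy in the example scores (up to rounding) the same as the original compared with itself.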

The option `--inv 1` evaluates model similarity on inverted images, and `--adv 1` evaluates it on adversarial
examples. The same arguments can be used to calculate functional similarity, e.g.,

```shell
# Calculate functional similarity on ImageNet
python func.py --dataset imagenet

# Calculate functional similarity on CIFAR-10
python func.py --dataset cifar10 --models cifar10

# Calculate functional similarity on SAT-6
python func.py --dataset sat6 --models imagenet
```
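
Functional similarity compares model outputs rather than internal activations. One common and simple measure is the fraction of inputs on which two models predict the same class. The sketch below shows that measure; whether `func.py` computes exactly this quantity is an assumption, and the function name is hypothetical.

```python
from typing import Sequence


def agreement_rate(preds_a: Sequence[int], preds_b: Sequence[int]) -> float:
    """Fraction of inputs for which two models predict the same class label."""
    if len(preds_a) != len(preds_b):
        raise ValueError("both models must be evaluated on the same inputs")
    if not preds_a:
        raise ValueError("at least one prediction is required")
    return sum(a == b for a, b in zip(preds_a, preds_b)) / len(preds_a)


# Toy predicted labels from two models on the same five inputs.
model_a_preds = [0, 1, 2, 2, 1]
model_b_preds = [0, 1, 1, 2, 0]
print(agreement_rate(model_a_preds, model_b_preds))  # 3 of 5 labels match
```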


## Acknowledgements

The code of this project was developed with the help of the following repositories and resources:

- https://github.com/implicitDeclaration/similarity
- https://haydn.fgl.dev/posts/a-better-index-of-similarity/
- https://github.com/js-d/sim_metric
- https://github.com/ahwillia/netrep
- https://github.com/dongxinshuai/RIFT-NeurIPS2021
- https://github.com/MadryLab/robustness
- https://github.com/google-research/google-research/tree/master/do_wide_and_deep_networks_learn_the_same_things
- https://github.com/thecml/pytorch-lmdb
