# SimDEx

This repository contains the code for our paper "Learning Coarse-Grained Representations: An Exploration of Mutual Information via Hyperspherical Density". Our method is implemented on top of the excellent code by USER USER and Bálint Gyires-Tóth for [*Whitening Consistently Improves Self-Supervised Learning*](https://arxiv.org/abs/2408.07519).

## How to run
For each SSL method, we provide a script to run the training. The scripts are located in the [pretrain](pretrain) folder.

The following pretraining methods are implemented:
- [Barlow Twins](pretrain/train_barlowtwins.py)
- [BYOL](pretrain/train_byol.py)
- [SimCLR](pretrain/train_simclr.py)
- [SwAV](pretrain/train_swav.py)
- [VICReg](pretrain/train_vicreg.py)
- [Supervised](pretrain/train_supervised.py)

For example, to run BYOL pretraining from the repository root:


    CUDA_VISIBLE_DEVICES=0 PYTHONPATH=. python pretrain/train_byol.py

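To run several of the listed methods one after another, a small launcher can wrap the command above. This is only a sketch: the helper names (`build_command`, `build_env`, `run_all`) are hypothetical and not part of the repository, it assumes it is started from the repository root, and it uses the current Python interpreter in place of `python`.

```python
import os
import subprocess
import sys

# Script names taken from the list of pretraining methods above.
METHODS = ["barlowtwins", "byol", "simclr", "swav", "vicreg", "supervised"]


def build_command(method):
    """Return the argv list for one pretraining script (hypothetical helper)."""
    if method not in METHODS:
        raise ValueError(f"unknown method: {method!r}")
    return [sys.executable, f"pretrain/train_{method}.py"]


def build_env(gpu="0"):
    """Copy the current environment and set the two variables used above."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = gpu
    env["PYTHONPATH"] = "."
    return env


def run_all(methods=METHODS, gpu="0", dry_run=True):
    """Print each command; with dry_run=False, also execute it and stop on failure."""
    for method in methods:
        cmd = build_command(method)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, env=build_env(gpu), check=True)


if __name__ == "__main__":
    # Dry run by default, so nothing is trained until dry_run=False is passed.
    run_all(dry_run=True)
```

With `dry_run=True` the launcher only prints the commands, which is a cheap way to check the script paths before committing GPU time.
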

## Setup

We recommend using the provided Docker container to run the code. 

### Option A: Start the Docker container and connect to it via ssh
1. Create a keypair, copy the public key to the root of this repo, and name it `cm-docker.pub`.
2. Run `make ssh`.
3. Connect on port 2233: `ssh root@<hostname> -i <private_key_path> -p 2233`. If the connection is refused, check the port mapping in the [Makefile](Makefile).

To run the container without starting an ssh server, run `make run`.

To customize Docker build and run, edit the [Makefile](Makefile) or the [Dockerfile](Dockerfile).

> [!WARNING]
> `make ssh` and `make run` start the container with the `--rm` flag! Only the contents of `/workspace` persist after the container is stopped (it is a simple volume mount)!

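Because only `/workspace` survives a container stop, it can help to check output paths before a long run. The sketch below is illustrative only: the function name and the default working directory are assumptions, not part of the repository, and it compares paths lexically without touching the filesystem.

```python
import posixpath
from pathlib import PurePosixPath

# Mount point named in the warning above.
WORKSPACE = PurePosixPath("/workspace")


def survives_container_stop(path, cwd="/workspace"):
    """Return True if `path` lies inside /workspace.

    `cwd` (an assumed default) is used to resolve relative paths.
    """
    absolute = posixpath.normpath(posixpath.join(cwd, path))
    candidate = PurePosixPath(absolute)
    return candidate == WORKSPACE or WORKSPACE in candidate.parents


if __name__ == "__main__":
    for p in ["/workspace/checkpoints/byol.ckpt", "/root/outputs/byol.ckpt"]:
        print(p, survives_container_stop(p))
```

Anything for which this returns `False` would be written to the container's own filesystem and lost once the container is removed.
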
### Option B: Install dependencies locally (not tested)

Install the requirements with `pip install -r requirements.txt`.

### Test run

To check the setup with a quick test run, use:

```
python -m pretrain.train_shannon_hyperspherical --config-name=simdex_dryrun
```

### Dataset setup
To set the path for the datasets, edit the [Makefile](Makefile)'s `data_path=...` line.

CIFAR-10 and STL-10 download automatically; to set up TinyImageNet, we provide a script: [utils/tiny_imagenet_setup.py](utils/tiny_imagenet_setup.py).

## Generating attention/gradient maps, other visualizations
To generate the attention maps or gradient maps, run the notebooks:
- `attention_exploration_all.ipynb`
- `im1k_SmoothGrad-cam-all.ipynb`

To generate the plots in the paper, run the notebooks in the `reports` folder.

For the ImageNet-100 metrics and visualizations, use `tsne-im100.ipynb`.


## Copyright, acknowledgements
The original code comes from https://github.com/kaland313/SSL-Whitening, the official implementation of [*Whitening Consistently Improves Self-Supervised Learning*](https://arxiv.org/abs/2408.07519).

Our implementation is based on the <img src="https://github.githubassets.com/pinned-octocat.svg" style="height:12pt;" /> [Lightly library](https://github.com/lightly-ai/lightly).
