# Experimental code for the ICLR 2022 submission `Sharp Learning Bounds for Contrastive Unsupervised Representation Learning`


## System related versions

- Python: 3.6.8
- CUDA: 11.2
- cuDNN: 8005
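
To compare a local machine against the versions listed above, something like the following could be used. This is a sketch only: `check_version` is a hypothetical helper that is not part of this repository, and it assumes that `python`, `nvcc`, and PyTorch are available (missing tools are reported as "not found").

```bash
# Hypothetical helper: compare a detected version with the one listed above.
check_version() {
    name="$1"; expected="$2"; actual="$3"
    if [ "$actual" = "$expected" ]; then
        echo "$name: $actual (matches)"
    else
        echo "$name: ${actual:-not found} (README lists $expected)"
    fi
}

# Python version, e.g. "3.6.8"; empty if no interpreter is on PATH.
py_version="$(python -c 'import platform; print(platform.python_version())' 2>/dev/null || true)"
check_version python 3.6.8 "$py_version"

# CUDA toolkit release parsed from the nvcc banner; empty if nvcc is missing.
cuda_version="$(nvcc --version 2>/dev/null | sed -n 's/.*release \([0-9.]*\),.*/\1/p' || true)"
check_version CUDA 11.2 "$cuda_version"

# cuDNN version as reported by PyTorch (assumption: torch is installed).
cudnn_version="$(python -c 'import torch; print(torch.backends.cudnn.version())' 2>/dev/null || true)"
check_version cuDNN 8005 "$cudnn_version"
```

A mismatch does not necessarily mean the code will fail; these are the versions the experiments were run with.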

## Create experimental environment

```bash
pip install -r requirements.txt

# install NVIDIA apex at the pinned commit
git clone git@github.com:NVIDIA/apex.git
cd apex
git checkout 54b93919aadc117cbab1fe5a2af4664bb9842928
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ..

# install gnu-parallel from source
wget https://ftp.gnu.org/gnu/parallel/parallel-20210622.tar.bz2
tar -jxvf parallel-20210622.tar.bz2
cd parallel-20210622
./configure --prefix="$HOME/bin"
make
make install
```
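
With an install prefix of `$HOME/bin`, `make install` places the `parallel` executable in `$HOME/bin/bin`, which is usually not on `PATH`. A minimal sketch for making it visible in the current shell, assuming that prefix (the variable name `PARALLEL_BIN_DIR` is only illustrative):

```bash
# Assumption: GNU parallel was installed with the prefix "$HOME/bin",
# so the executable lives in "$HOME/bin/bin".
PARALLEL_BIN_DIR="$HOME/bin/bin"

# Prepend the directory to PATH only if it is not already there.
case ":$PATH:" in
    *":$PARALLEL_BIN_DIR:"*) ;;
    *) export PATH="$PARALLEL_BIN_DIR:$PATH" ;;
esac

# Report which parallel (if any) the shell now resolves.
if command -v parallel >/dev/null 2>&1; then
    parallel --version 2>/dev/null | head -n 1 || true
else
    echo "parallel not found on PATH" >&2
fi
```

To make the change persistent, the `export` line can be added to a shell startup file such as `~/.bashrc`.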

## Execute training code

- Toy: Please run `code/jobs/toy/circle.sh`.
- Vision: Please run `{cifar10,cifar100}/contrastive/seed_*.sh` under the [`code/jobs/vision`](./code/jobs/vision) directory.
- Language: Please run [`code/jobs/language/wiki3029/contrastive.sh`](code/jobs/language/wiki3029/contrastive.sh).
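
The vision pattern above expands to several scripts. One way to run them in sequence is sketched below. `list_seed_scripts` is a hypothetical helper, not part of the repository, and the sketch assumes the scripts can be launched with `bash` from the repository root; check the scripts themselves for the working directory they expect. By default it only prints what it would run.

```bash
# Hypothetical helper: print every seed script for both datasets.
list_seed_scripts() {
    jobs_dir="${1:-code/jobs/vision}"
    for dataset in cifar10 cifar100; do
        for script in "$jobs_dir/$dataset"/contrastive/seed_*.sh; do
            # Skip the unexpanded pattern when nothing matches.
            [ -e "$script" ] && printf '%s\n' "$script"
        done
    done
    return 0
}

# Dry run by default; set RUN=1 to actually execute each script.
list_seed_scripts | while IFS= read -r script; do
    if [ "${RUN:-0}" = "1" ]; then
        bash "$script"
    else
        echo "would run: $script"
    fi
done
```

For example, `RUN=1 bash run_seeds.sh` would execute the scripts if the snippet were saved under the (hypothetical) name `run_seeds.sh`.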

## Evaluation

### Preparation

Please run [`./code/notebooks/extract_eval_weights_from_wandb.ipynb`](./code/notebooks/extract_eval_weights_from_wandb.ipynb).

### Perform evaluation to draw plots

- Vision: Please run the `{cifar10,cifar100}/mean_eval.sh`, `{cifar10,cifar100}/linear_eval.sh` and `{cifar10,cifar100}/contrastive_eval.sh` scripts under the [`code/jobs/vision`](./code/jobs/vision) directory.
- Language: Please run the `mean_eval.sh`, `linear_eval.sh` and `contrastive_eval.sh` scripts under the [`code/jobs/language/wiki3029`](./code/jobs/language/wiki3029) directory.

### Create plots

Please run the following scripts and notebooks to generate the plots.

#### Synthetic experiments

- `scripts/compare_upper_bound.py`
- `scripts/make_toy_figure.py`
- `scripts/plot_k_nceloss.py`
- `scripts/plot_k_msuploss.py`
- `scripts/plot_toy_trajectory.py`

#### Real benchmark datasets

- `code/notebooks/cv_heatmap.ipynb`
- `code/notebooks/bound.ipynb`

---


**Note:** This codebase tracks experiments using Weights & Biases, but with its default settings wandb might cause [hanging at the beginning of training](https://docs.wandb.ai/guides/track/advanced/distributed-training#hanging-at-the-beginning-of-training) in the DDP training code. To avoid this, please set the following environment variable:

```bash
export WANDB_START_METHOD="thread"
```
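
Note that a plain shell assignment stays local to the current shell. The variable has to be exported (or prefixed to the launch command) for the training processes to see it. The following minimal check illustrates the difference. It is a sketch that uses `sh -c` as a stand-in for a training script, and the names `before_export` and `child_value` are only illustrative.

```bash
# Start from a clean state so the comparison is meaningful.
unset WANDB_START_METHOD

# Plain assignment: visible in this shell, but not in child processes.
WANDB_START_METHOD="thread"
before_export="$(sh -c 'printf "%s" "${WANDB_START_METHOD:-unset}"')"
echo "child before export: ${before_export}"   # prints "unset"

# Exported: every process started from this shell inherits the value.
export WANDB_START_METHOD
child_value="$(sh -c 'printf "%s" "${WANDB_START_METHOD:-unset}"')"
echo "child after export: ${child_value}"      # prints "thread"
```

Alternatively, the variable can be set for a single command by prefixing it, as in `WANDB_START_METHOD=thread bash <job script>`.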
