# Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity
Implementation for Co-Mixup

This is the code for the paper "Co-Mixup: Saliency Guided Joint Mixup with Supermodular Diversity" submitted to ICLR'21. Parts of the code are borrowed from manifold mixup ([link](https://github.com/vikasverma1077/manifold_mixup/tree/master/supervised)).

## Requirements
This code is tested with  
python 3.7.6   
pytorch 1.4.0    
torchvision 0.5.0    
gco-wrapper (https://github.com/Borda/pyGCO)

CUDA 10.0  
cuDNN 7603

## Download Checkpoints and Test
We provide a PreActResNet18 checkpoint trained on CIFAR-100 with Co-Mixup. The model achieves 80.31% clean test accuracy at the last epoch. Note that the CIFAR-100 dataset will be downloaded to ```[data_path]``` if it does not already exist there.

To test the model, run:
```
python main.py --resume ./checkpoint/checkpoint-co-mixup.pth.tar --evaluate --data_dir [data_path]
```


## Reproducing the results
Detailed descriptions of the arguments are provided in ```main.py```. Below are some example commands for reproducing the experimental results.

### CIFAR-100
The dataset will be downloaded to ```[data_path]``` and the results will be saved to ```[save_path]```. To run the code without saving results, set ```--log_off True```.

* To reproduce **Co-Mixup with PreActResNet18 for 300 epochs**, run:
```
python main.py --dataset cifar100 --data_dir [data_path] --root_dir [save_path] --labels_per_class 500 --arch preactresnet18  --learning_rate 0.2 --momentum 0.9 --decay 0.0001 --epochs 300 --schedule 100 200 --gammas 0.1 0.1 --match_mix True --mixup_alpha 2.0 --clean_lam 1.0 --lam_dist 0.001 --m_beta 0.32 --m_gamma 1.0 --m_thres 0.83 --m_eta 0.05 --m_block_num 4
```
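The `--learning_rate 0.2 --schedule 100 200 --gammas 0.1 0.1` flags in the command above appear to describe a step learning-rate schedule. The sketch below assumes the learning rate is multiplied by each gamma once training reaches the matching milestone epoch. `lr_at_epoch` is a hypothetical helper written for illustration, not a function from `main.py`, so check the argument descriptions there for the actual behavior.

```python
def lr_at_epoch(epoch, base_lr=0.2, schedule=(100, 200), gammas=(0.1, 0.1)):
    """Return the learning rate at `epoch` under an ASSUMED step schedule.

    Assumption: each gamma is applied multiplicatively once `epoch`
    reaches the corresponding milestone in `schedule`.
    """
    if len(schedule) != len(gammas):
        raise ValueError("schedule and gammas must have the same length")
    lr = base_lr
    for milestone, gamma in zip(schedule, gammas):
        if epoch >= milestone:
            lr *= gamma
    return lr


# Print the assumed learning rate around each milestone of the 300-epoch run.
for epoch in (0, 99, 100, 199, 200, 299):
    print(f"epoch {epoch:3d}: lr = {lr_at_epoch(epoch):.4f}")
```

Under this assumption, the same helper covers the Tiny-Imagenet-200 setting by passing `schedule=(600, 900)`.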

Below are the commands to reproduce the baselines.

* To reproduce **Vanilla with PreActResNet18 for 300 epochs**, run:
```
python main.py --dataset cifar100 --data_dir [data_path] --root_dir [save_path] --labels_per_class 500 --arch preactresnet18  --learning_rate 0.2 --momentum 0.9 --decay 0.0001 --epochs 300 --schedule 100 200 --gammas 0.1 0.1 --train vanilla
```

* To reproduce **input mixup with PreActResNet18 for 300 epochs**, run:
```
python main.py --dataset cifar100 --data_dir [data_path] --root_dir [save_path] --labels_per_class 500 --arch preactresnet18  --learning_rate 0.2 --momentum 0.9 --decay 0.0001 --epochs 300 --schedule 100 200 --gammas 0.1 0.1 --train mixup --mixup_alpha 1.0
```

* To reproduce **manifold mixup with PreActResNet18 for 300 epochs**, run:
```
python main.py --dataset cifar100 --data_dir [data_path] --root_dir [save_path] --labels_per_class 500 --arch preactresnet18  --learning_rate 0.2 --momentum 0.9 --decay 0.0001 --epochs 300 --schedule 100 200 --gammas 0.1 0.1 --train mixup_hidden --mixup_alpha 1.0
```

* To reproduce **CutMix with PreActResNet18 for 300 epochs**, run:
```
python main.py --dataset cifar100 --data_dir [data_path] --root_dir [save_path] --labels_per_class 500 --arch preactresnet18  --learning_rate 0.2 --momentum 0.9 --decay 0.0001 --epochs 300 --schedule 100 200 --gammas 0.1 0.1 --train mixup --mixup_alpha 1.0 --box True
```

* To reproduce **Puzzle Mix with PreActResNet18 for 300 epochs**, run:
```
python main.py --dataset cifar100 --data_dir [data_path] --root_dir [save_path] --labels_per_class 500 --arch preactresnet18  --learning_rate 0.2 --momentum 0.9 --decay 0.0001 --epochs 300 --schedule 100 200 --gammas 0.1 0.1 --train mixup --mixup_alpha 1.0 --graph True --clean_lam 0
```


### Some notes
- To train models with a reduced amount of training data, set --labels_per_class [number].  
- To reduce training time, set --m_niter 3 or try a smaller partition with --m_part [number].  
- Reasonable parameter ranges to consider are m_beta: [0.5, 2.0] and m_thres: [0.81, 0.84]. 
- Clean input regularization via --clean_lam allows a high --mixup_alpha. With --clean_lam 0, --mixup_alpha should be decreased accordingly.
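
To explore the m_beta and m_thres ranges mentioned in the notes, one option is to generate one training command per combination. The sketch below only builds command strings from the flags shown in the Co-Mixup CIFAR-100 command. The grid points and the `build_commands` helper are illustrative assumptions, not values or code from this repository.

```python
import itertools

# Flags copied from the Co-Mixup CIFAR-100 command in this README;
# [data_path] and [save_path] are left as placeholders to fill in.
BASE_ARGS = (
    "python main.py --dataset cifar100 --data_dir [data_path] "
    "--root_dir [save_path] --arch preactresnet18 --learning_rate 0.2 "
    "--momentum 0.9 --decay 0.0001 --epochs 300 --schedule 100 200 "
    "--gammas 0.1 0.1 --match_mix True --mixup_alpha 2.0 --clean_lam 1.0 "
    "--lam_dist 0.001 --m_gamma 1.0 --m_eta 0.05 --m_block_num 4"
)


def build_commands(m_betas, m_thresholds, labels_per_class=500):
    """Return one command string per (m_beta, m_thres) pair (hypothetical helper)."""
    commands = []
    for m_beta, m_thres in itertools.product(m_betas, m_thresholds):
        commands.append(
            f"{BASE_ARGS} --labels_per_class {labels_per_class} "
            f"--m_beta {m_beta} --m_thres {m_thres}"
        )
    return commands


# Illustrative grid inside the ranges from the notes (an assumption).
for command in build_commands((0.5, 1.0, 2.0), (0.81, 0.83, 0.84)):
    print(command)
```

Each printed line can be pasted into a shell or job scheduler after replacing the placeholders.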

---------------------
### Tiny-Imagenet-200

#### Download dataset
The following steps are adapted from manifold mixup ([link](https://github.com/vikasverma1077/manifold_mixup/tree/master/supervised)).

1. Download the zipped data from https://tiny-imagenet.herokuapp.com/  
2. If it does not already exist, create a subfolder "data" in the root folder "Co-Mixup"  
3. Extract the zipped data into the folder Co-Mixup/data  
4. Run the following script (this will arrange the validation data in the format required by the PyTorch loader)
```
python load_data.py
```
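
For orientation, the sketch below shows one way such a rearrangement could work. It is not the repository's `load_data.py`. It assumes the standard Tiny ImageNet layout, in which `val/val_annotations.txt` lists tab-separated image file names and class IDs and the images sit in `val/images/`. The function name `arrange_val_by_class` and the target layout (one folder per class ID) are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def arrange_val_by_class(val_dir):
    """Move validation images into val_dir/<class_id>/ (assumed layout).

    Returns the number of images moved.
    """
    val_dir = Path(val_dir)
    image_dir = val_dir / "images"
    moved = 0
    with (val_dir / "val_annotations.txt").open() as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            file_name, class_id = fields[0], fields[1]
            src = image_dir / file_name
            if not src.exists():
                continue  # already moved, or missing from the archive
            class_dir = val_dir / class_id
            class_dir.mkdir(exist_ok=True)
            shutil.move(str(src), str(class_dir / file_name))
            moved += 1
    # Remove the emptied images/ folder so a folder-per-class loader
    # does not mistake it for an extra class.
    if image_dir.is_dir() and not any(image_dir.iterdir()):
        image_dir.rmdir()
    return moved
```

A call would look like `arrange_val_by_class("data/tiny-imagenet-200/val")`, where the path is a guess at where the archive extracts.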

* To reproduce **Co-Mixup with PreActResNet18 for 1200 epochs**, run:
```
python main.py --dataset tiny-imagenet-200 --data_dir [data_path] --root_dir [save_path] --labels_per_class 500 --arch preactresnet18 --learning_rate 0.2 --momentum 0.9 --decay 0.0001 --epochs 1200 --schedule 600 900 --gammas 0.1 0.1 --match_mix True --mixup_alpha 2.0 --clean_lam 1.0 --lam_dist 0.001 --m_beta 0.32 --m_gamma 1.0 --m_thres 0.83 --m_eta 0.05 --m_block_num 4
```

