# How to benchmark TULiP using OpenOOD v1.5?

We plan to submit our OOD detector to OpenOOD in the future. For now, please follow this guide.

## Set up

1. Clone the [OpenOOD v1.5 GitHub repository](https://github.com/Jingkang50/OpenOOD) to your local machine. Download the necessary datasets and install the dependencies. (Our method has been tested up to commit 275260a.)

2. Put `tulip_net.py`, `param_inject.py` into `openood/networks`; `tulip_postprocessor.py` into `openood/postprocessors`; and `tulip.yml` into `configs/postprocessors`.

3. In `openood/evaluation_api/postprocessor.py`, import `TulipPostprocessor` and add the following entry to the `postprocessors` dictionary:
    ```python
    postprocessors = {
        ...
        'tulip': TulipPostprocessor,
        ...
    }
    ```

4. In `openood/evaluation_api/evaluator.py`, import `TulipNet` and add the following branch:
    ```python
    # wrap base model to work with certain postprocessors
    ...
    elif postprocessor_name == 'tulip':
        net = TulipNet(net)
    ```

5. That's it! You can now evaluate TULiP using all the functionality provided by the OpenOOD framework (for Table 1, we mainly use `evaluator.eval_ood()`).
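
Before running an evaluation, it can help to confirm that the four files from step 2 landed where OpenOOD expects them. The following is a minimal sketch using only the standard library; the helper name `missing_tulip_files` and its `openood_root` argument are hypothetical and not part of TULiP or OpenOOD.

```python
from pathlib import Path

# Destination of each TULiP file relative to the OpenOOD repository root,
# as listed in step 2 of the set-up.
EXPECTED_FILES = [
    "openood/networks/tulip_net.py",
    "openood/networks/param_inject.py",
    "openood/postprocessors/tulip_postprocessor.py",
    "configs/postprocessors/tulip.yml",
]


def missing_tulip_files(openood_root):
    """Return the expected TULiP files that are absent under openood_root."""
    root = Path(openood_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Assumes the script is run from the root of the OpenOOD clone.
missing = missing_tulip_files(".")
print("Missing files:", missing if missing else "none")
```

An empty result only means the files exist; the edits to `postprocessor.py` and `evaluator.py` in steps 3 and 4 still need to be made by hand.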

## Experiment settings

The **exact hyperparameters** used to produce Tables 1 and 2 are as follows:

| Benchmark | Hyperparameters |
|:--:| :--: |
| CIFAR-10 | K = 1, lambda = 1, delta = 2, power = 1.5/2 |
| CIFAR-100 | K = 1, lambda = 1, delta = 2, power = 1.5/2 |
| ImageNet-200 | K = 1, lambda = 3, delta = 8, power = 1.5 |
| ImageNet-1K | K = 1, lambda = 3, delta = 8, power = 1.5 |
| ImageNet-1K (Cov-Shift) | K = 1, lambda = 1, delta = 8, power = 2 |

Note that these parameters were found using the validation set described in the paper. The results were averaged over 3 runs with the 3 sets of weights provided by OpenOOD (s0/s1/s2), except for ImageNet-1K, where we only used `models.ResNet50_Weights.IMAGENET1K_V1` from `torchvision`.
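
Averaging over the three sets of weights can be sketched as below. This is only an illustration: the helper `average_over_runs` is hypothetical, and the numbers are made-up placeholders, not results from the paper.

```python
from statistics import mean


def average_over_runs(runs):
    """Average each metric over several runs.

    `runs` maps a run name (e.g. 's0') to a dict of metric name -> value.
    All runs are assumed to report the same set of metrics.
    """
    metric_names = next(iter(runs.values())).keys()
    return {
        name: mean(run[name] for run in runs.values())
        for name in metric_names
    }


# Placeholder values for illustration only; NOT results from the paper.
runs = {
    "s0": {"FPR@95": 40.0, "AUROC": 90.0},
    "s1": {"FPR@95": 50.0, "AUROC": 88.0},
    "s2": {"FPR@95": 45.0, "AUROC": 92.0},
}

averaged = average_over_runs(runs)
print(averaged)
```

In practice, each inner dict would be filled from the metrics of one evaluation run per set of weights.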

To run the **network architecture** experiments, simply use a network of your choice from `torchvision.models`, e.g.:
```python
import torchvision.models as models

weights = models.MobileNet_V3_Large_Weights.DEFAULT
net = models.mobilenet_v3_large(weights=weights)
preprocessor = weights.transforms()
```

### To implement Covariate Shift experiments

1. Add the relevant dataset information in `openood/evaluation_api/datasets.py`, such as
```python
DATA_INFO = {
    ...
    'imagenet': {
        ...
        'cs': {
            'datasets': ['imagenet_c', ...],
            'imagenet_c': {
                'data_dir': 'images_largescale/',
                'imglist_path':
                'benchmark_imglist/imagenet/test_imagenet_c.txt'
            },
            ...
        }
    }
}
```

2. Add the following method to the `Evaluator` class in `openood/evaluation_api/evaluator.py`.
```python
def eval_csood(self, progress: bool = True):
        id_name = 'id'
        task = 'ood'
        if self.metrics[task] is None:
            self.net.eval()

            # id score
            if self.scores['id']['test'] is None:
                print(f'Performing inference on {self.id_name} test set...',
                      flush=True)
                id_pred, id_conf, id_gt = self.postprocessor.inference(
                    self.net, self.dataloader_dict['id']['test'], progress)
                self.scores['id']['test'] = [id_pred, id_conf, id_gt]
            else:
                id_pred, id_conf, id_gt = self.scores['id']['test']

            # load csood data and compute ood metrics
            cs_metrics = self._eval_ood([id_pred, id_conf, id_gt],
                                         ood_split='cs',
                                         progress=progress)

            if self.metrics[f'{id_name}_acc'] is None:
                self.eval_acc(id_name)
            cs_metrics[:, -1] = np.array([self.metrics[f'{id_name}_acc']] *
                                          len(cs_metrics))

            self.metrics[task] = pd.DataFrame(
                np.concatenate([cs_metrics], axis=0),
                index=list(self.dataloader_dict['ood']['cs'].keys()) + ['csood'],
                columns=['FPR@95', 'AUROC', 'AUPR_IN', 'AUPR_OUT', 'ACC'],
            )
        else:
            print('Evaluation has already been done!')

        with pd.option_context(
                'display.max_rows', None, 'display.max_columns', None,
                'display.float_format',
                '{:,.2f}'.format):  # more options can be specified also
            print(self.metrics[task])

        return self.metrics # [task]

```

3. Now you can use `evaluator.eval_csood()` to invoke the CS-OOD experiment.

#### Blurred ImageNet

For `imagenet_blur`, the following `preprocessor` is used. Add it to `openood/evaluation_api/preprocessor.py`:

```python
class BlurPreProcessor(BasePreprocessor):
    """For test and validation dataset standard image transformation + blur."""
    def __init__(self, config: Config):
        self.transform = tvs_trans.Compose([
            Convert('RGB'),
            tvs_trans.Resize(config.pre_size, interpolation=INTERPOLATION),
            tvs_trans.CenterCrop(config.img_size),
            tvs_trans.GaussianBlur(kernel_size=11, sigma=2.0),
            tvs_trans.ToTensor(),
            tvs_trans.Normalize(*config.normalization),
        ])
```

We simply added a `GaussianBlur` with `kernel_size = 11`, `sigma = 2.0` to the standard test transforms.
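
To get a feel for how strong this blur is, the sketch below computes the weights of a normalized 1-D Gaussian kernel with `kernel_size = 11` and `sigma = 2.0` in plain Python. The helper `gaussian_kernel_1d` is hypothetical, and treating the 2-D blur as this kernel applied along each axis is a textbook simplification, not a statement about torchvision's internals.

```python
import math


def gaussian_kernel_1d(kernel_size, sigma):
    """Return normalized Gaussian weights sampled at integer pixel offsets."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    half = kernel_size // 2
    weights = [
        math.exp(-0.5 * (x / sigma) ** 2) for x in range(-half, half + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


kernel = gaussian_kernel_1d(kernel_size=11, sigma=2.0)
for offset, weight in zip(range(-5, 6), kernel):
    print(f"{offset:+d}: {weight:.4f}")
```

With these settings the window reaches 2.5 sigma on each side, so the outermost weights are roughly 4% of the center weight.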

To use the preprocessor, create an instance and pass it as the `preprocessor` argument when creating the dataset (e.g., in `openood/evaluation_api/datasets.py`):
```python
...
# id_name = "imagenet"
blur_preprocessor = get_default_blur_preprocessor(id_name)
...
# dataloaders for csood
sub_dataloader_dict = {}
for dataset_name in split_config['datasets']:
    dataset_config = split_config[dataset_name]
    dataset = ImglistDataset(
        name='_'.join((id_name, 'ood', dataset_name)),
        imglist_pth=os.path.join(data_root,
                                 dataset_config['imglist_path']),
        data_dir=os.path.join(data_root,
                              dataset_config['data_dir']),
        num_classes=data_info['num_classes'],
        preprocessor=blur_preprocessor if dataset_name == 'imagenet_blur' else preprocessor, # <-- e.g., Here
        data_aux_preprocessor=test_standard_preprocessor)
    ...
...
```

`imagenet_blur` uses the same images as `imagenet` (in either the `val` or `test` split). To add an entry for it to `DATA_INFO`, simply copy the settings of `imagenet`:

```python
# Validation
'imagenet_blur': {
    'data_dir': 'images_largescale/',
    'imglist_path': 'benchmark_imglist/imagenet/val_imagenet.txt'
}

# Test
'imagenet_blur': {
    'data_dir': 'images_largescale/',
    'imglist_path':
    'benchmark_imglist/imagenet/test_imagenet.txt'
}
```

Now you can use `imagenet_blur` for validation or testing.
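
Copying the settings can also be done programmatically instead of by hand. The sketch below works on a toy dictionary: the helper `add_alias_dataset` is hypothetical, and the nesting under `'id'` and `'cs'` is an assumption made for illustration that may not match the real `DATA_INFO` exactly.

```python
import copy

# Toy stand-in for the 'imagenet' entry of DATA_INFO; the nesting under
# 'id' and 'cs' is an assumption made for this illustration.
imagenet_info = {
    "id": {
        "test": {
            "data_dir": "images_largescale/",
            "imglist_path": "benchmark_imglist/imagenet/test_imagenet.txt",
        },
    },
    "cs": {"datasets": []},
}


def add_alias_dataset(info, split, new_name, source):
    """Register new_name under info[split] as a deep copy of source."""
    info[split][new_name] = copy.deepcopy(source)
    if new_name not in info[split]["datasets"]:
        info[split]["datasets"].append(new_name)


add_alias_dataset(imagenet_info, "cs", "imagenet_blur",
                  imagenet_info["id"]["test"])
print(imagenet_info["cs"])
```

The deep copy keeps the new entry independent of the original, so later changes to one do not silently affect the other.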
