# Can Pre-trained Networks Detect Familiar Out-of-Distribution Data?
This is the official PyTorch implementation of our ICLR 2024 submission.

<p align="center">
    <img src="figure/figure_problem_setting.png" width="900">
</p>

### Abstract
Most studies of out-of-distribution (OOD) detection do not use pre-trained models and instead train a backbone from scratch. In recent years, however, transferring knowledge from large pre-trained models to downstream tasks through lightweight tuning has become the mainstream way to train classifiers. In bridging the gap between OOD detection practice and today's classifiers, a unique and crucial problem arises: samples whose information the pre-trained network already holds often arrive as OOD input. We consider that such data may significantly affect the performance of large pre-trained networks, because how well these OOD samples can be discriminated depends on the pre-training algorithm.
Here, we define such OOD data as PT-OOD (Pre-Trained OOD) data. 
In this paper, we aim to reveal the effect of PT-OOD on the OOD detection performance of pre-trained networks from the perspective of pre-training algorithms.
To achieve this, we explore the PT-OOD detection performance of supervised and self-supervised pre-training algorithms with linear-probing tuning, the most common efficient tuning method. Through our experiments and analysis, we find that the low linear separability of PT-OOD in the feature space heavily degrades PT-OOD detection performance, and that self-supervised models are more vulnerable to PT-OOD than supervised pre-trained models, even with state-of-the-art detection methods. To address this vulnerability, we further propose a solution unique to large-scale pre-trained models: leveraging the powerful instance-by-instance discriminative representations of pre-trained models and detecting OOD in the feature space, independent of the in-distribution (ID) decision boundaries.
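
To make the idea of scoring in the feature space, independent of any ID decision boundary, more concrete, here is a minimal sketch. It assumes a k-nearest-neighbor distance to stored ID features as the score, which is one common instance-level choice and not necessarily the exact method implemented in this repository. The names `l2_normalize`, `knn_ood_score`, and `id_bank` are hypothetical, and plain Python lists stand in for backbone features.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def knn_ood_score(query, id_bank, k=1):
    """Return the distance from `query` to its k-th nearest ID feature.

    A larger score means the sample lies farther from the stored ID features
    and is therefore more likely to be OOD. No classifier head is involved.
    """
    q = l2_normalize(query)
    dists = sorted(math.dist(q, l2_normalize(f)) for f in id_bank)
    return dists[k - 1]


# Toy 2-D "features" (assumption: real features come from the frozen backbone).
id_bank = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.05]]
near_id = knn_ood_score([0.95, 0.05], id_bank, k=1)  # close to the ID cluster
far_ood = knn_ood_score([0.0, 1.0], id_bank, k=1)    # far from the ID cluster
```

In this toy example the second query receives the larger score, so thresholding the score would flag it as OOD without consulting the linear classifier at all.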

## 1. Requirements
### 1. Environments 
We ran the code on a single NVIDIA V100 or A100 GPU.

- Create a conda virtual environment and activate it:

```bash
conda create -n pre_ood python=3.9.7 -y
conda activate pre_ood
```
- Install the required libraries:
```bash
pip install -r requirement.txt
```
- Create the `data` and `ssl_methods` folders:
```bash
mkdir data
mkdir ssl_methods
```

### 2. Datasets 
#### ID: CIFAR-10, CIFAR-100, Food-101, Caltech-101
CIFAR-10, CIFAR-100, and Food-101 are downloaded automatically to `data` when you run `train.py`.

For Caltech-101, we provide the curated dataset via [this url](https://drive.google.com/file/d/187xhBIKm5YdxEr6OJMbAH8jEuZbVu2VS/view?usp=sharing). The original data can be downloaded via [this url](https://data.caltech.edu/records/mzrjq-6wc02).

#### non-PT-OOD: iSUN, LSUN, CIFAR-100, CIFAR-10
Please download the following datasets to `data`.
* [iSUN](https://www.dropbox.com/s/ssz7qxfqae0cca5/iSUN.tar.gz) from https://github.com/alinlab/CSI
* [LSUN](https://www.dropbox.com/s/moqh2wh8696c3yl/LSUN_resize.tar.gz) from https://github.com/facebookresearch/odin/
#### PT-OOD: ImageNet-30, ImageNet-20-O
Please download the following datasets via [this url](https://drive.google.com/file/d/187xhBIKm5YdxEr6OJMbAH8jEuZbVu2VS/view?usp=sharing) and move them to `data`.



The overall file structure is as follows:
```

|-- data
    |-- caltech99
    |-- cifar-10-batches-py (Automatically downloaded)
    |-- cifar-100-python (Automatically downloaded)
    |-- IN20
    |-- IN30_test_cifar10
    |-- IN30_test_food
    |-- iSUN
    |-- LSUN_resize
    ...
```
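
Before training, it can help to confirm that the folders listed in the tree above are in place. The following is a small, optional sketch and not part of the repository. The helper name `missing_dirs` is hypothetical, and because the tree ends with `...`, the list below is not exhaustive.

```python
from pathlib import Path

# Folder names copied from the tree above; the tree ends with "...",
# so additional folders may be required.
EXPECTED_DIRS = [
    "caltech99",
    "cifar-10-batches-py",   # created automatically on first download
    "cifar-100-python",      # created automatically on first download
    "IN20",
    "IN30_test_cifar10",
    "IN30_test_food",
    "iSUN",
    "LSUN_resize",
]


def missing_dirs(root="data", expected=EXPECTED_DIRS):
    """Return the expected sub-folders that do not exist under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs()
    if missing:
        print(f"Missing under data/: {missing}")
    else:
        print("All listed folders found.")
```

The two CIFAR folders are expected to be reported as missing until the automatic download has run once.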

### 3. Pre-trained self-supervised models
Please download the following checkpoints to `ssl_methods`.  
* [BYOL](https://drive.google.com/u/0/uc?id=1TLZHDbV-qQlLjkR8P0LZaxzwEE6O_7g1&export=download) from https://github.com/yaox12/BYOL-PyTorch     
* [SwAV](https://dl.fbaipublicfiles.com/deepcluster/swav_200ep_pretrain.pth.tar) from https://github.com/facebookresearch/swav      
* [MoCo v2](https://dl.fbaipublicfiles.com/moco/moco_checkpoints/moco_v2_200ep/moco_v2_200ep_pretrain.pth.tar) from https://github.com/facebookresearch/moco
* [iBOT](https://lf3-nlp-opensource.bytetos.com/obj/nlp-opensource/archive/2022/ibot/vits_16/checkpoint_teacher.pth) from https://github.com/bytedance/ibot
* [DINO](https://dl.fbaipublicfiles.com/dino/dino_deitsmall16_pretrain/dino_deitsmall16_pretrain.pth) from https://github.com/facebookresearch/dino 
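
Self-supervised checkpoints often store backbone weights under a wrapper prefix, so the keys need renaming before they fit a plain backbone. The sketch below shows this step for the `module.encoder_q.` prefix used by the official MoCo linear-classification script. The helper `extract_backbone_weights` is hypothetical, the other checkpoints use different key layouts, and the loading code in this repository may already handle all of this for you.

```python
def extract_backbone_weights(state_dict, prefix="module.encoder_q.", skip=("fc.",)):
    """Keep only keys under `prefix`, drop the prefix, and skip head layers.

    `skip` lists sub-module prefixes (after stripping) that belong to the
    pre-training head rather than the backbone.
    """
    backbone = {}
    for key, value in state_dict.items():
        if not key.startswith(prefix):
            continue  # e.g. the momentum encoder or the queue
        short = key[len(prefix):]
        if short.startswith(skip):
            continue  # projection head, not needed for linear probing
        backbone[short] = value
    return backbone


# Toy stand-in for the "state_dict" entry of a loaded MoCo checkpoint
# (assumption: strings replace the real weight tensors).
toy_state_dict = {
    "module.encoder_q.conv1.weight": "w_conv1",
    "module.encoder_q.layer1.0.conv1.weight": "w_layer1",
    "module.encoder_q.fc.0.weight": "w_head",
    "module.encoder_k.conv1.weight": "w_momentum",
}
backbone_weights = extract_backbone_weights(toy_state_dict)
```

With a real checkpoint, the resulting dictionary would typically be passed to the backbone's `load_state_dict` with `strict=False`, since the classifier layer is trained afterwards.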

## 2. Training
Please refer to `run_train.sh` for the training commands.


## 3. Evaluation
Please refer to `run_eval.sh` for the evaluation commands.


For other OOD detection methods, we use the code in [ViM](https://github.com/haoqiwang/vim).
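
OOD detectors are commonly compared by AUROC, the probability that a randomly chosen OOD sample receives a higher OOD score than a randomly chosen ID sample. The sketch below is a dependency-free illustration of that metric. It assumes that higher scores mean "more OOD", the function name `auroc` is hypothetical, and the metrics actually reported by `run_eval.sh` may differ.

```python
def auroc(id_scores, ood_scores):
    """Fraction of (ID, OOD) pairs in which the OOD sample scores higher.

    Ties count as half a win. This pairwise form is O(n * m) and intended
    for illustration on small score lists only.
    """
    wins = 0.0
    for ood in ood_scores:
        for ind in id_scores:
            if ood > ind:
                wins += 1.0
            elif ood == ind:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))


# Toy scores (assumption: higher score = more likely OOD).
perfect = auroc([0.1, 0.2], [0.8, 0.9])   # every OOD score exceeds every ID score
chance = auroc([0.1, 0.9], [0.5, 0.5])    # OOD scores sit between the ID scores
```

A value of 1.0 means the two score distributions are perfectly separated, while 0.5 corresponds to chance-level detection.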

## 4. Acknowledgement
This code is based on the implementations of [MoCo](https://github.com/facebookresearch/moco) for training and [ViM](https://github.com/haoqiwang/vim) for evaluation.