## CLIP-CAV

### Required Packages

```
torch==2.0.0+cu117
torchvision==0.15.0+cu117
huggingface-hub==0.20.2
sentence-transformers==2.3.1
pytorchcv==0.0.67
imageio==2.33.1
ftfy==6.1.1
```

### Data Preparation

1. Place the ImageNet dataset in `datasets/ILSVRC2012`.
2. Extract the image and text features of the CLIP model by running the following command:

```
python save_clip_class_cavs.py
```

3. Extract the image features of the target model (here, with the ResNet18 backbone) by running the following command:

```
python save_target_class_cavs.py \
    --model resnet18 \
    --weights ResNet18_Weights.IMAGENET1K_V1 \
    --test-only
```
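
The script names suggest that both extraction steps store one vector per class. As a rough illustration only, assuming a class vector is the normalized mean of that class's L2-normalized features (the repository's scripts may compute it differently), such vectors could be built from a feature matrix as sketched below. The function name `class_mean_directions` and the toy data are hypothetical.

```python
import numpy as np


def class_mean_directions(features: np.ndarray, labels: np.ndarray, num_classes: int) -> np.ndarray:
    """Return one unit-length mean feature direction per class.

    Assumption: this per-class mean is an illustrative stand-in; the
    repository's scripts may compute class vectors differently.
    """
    # L2-normalize each feature so every sample contributes equally.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)

    directions = np.zeros((num_classes, features.shape[1]), dtype=np.float64)
    for c in range(num_classes):
        mask = labels == c
        if mask.any():
            directions[c] = unit[mask].mean(axis=0)

    # Re-normalize the means; classes with no samples stay as zero vectors.
    dir_norms = np.linalg.norm(directions, axis=1, keepdims=True)
    return directions / np.clip(dir_norms, 1e-12, None)


# Toy usage with random stand-in features (4 samples, 3 dims, 2 classes).
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 3))
labs = np.array([0, 0, 1, 1])
cavs = class_mean_directions(feats, labs, num_classes=2)
print(cavs.shape)  # (2, 3)
```

Normalizing before averaging keeps samples with large feature norms from dominating a class direction, which matters if the vectors are later compared by cosine similarity.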

### Training Instructions

An example of training CLIP-CAVs on the ResNet18 backbone:

```
python train_cav.py \
    --model resnet18 \
    --lr 1e-1 \
    --epochs 3 \
    --seed 1028 \
    --batch-size 1024 \
    --output-dir output_cosine_cav_1000classes
```

An example of debugging the model on the full ImageNet dataset with the ResNet18 backbone:

```
python train_target_model.py \
    --data-set ImageNet \
    --model resnet18 \
    --lr 1e-3 \
    --data-T 3 \
    --batch-size 512 \
    --epochs 20 \
    --seed 1028 \
    --output-dir output_cosine \
    --use-target-cav
```
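
For convenience, the four commands in this README can be chained from a small Python driver. This is a minimal sketch, assuming the steps are meant to run in the order listed and from the repository root. The names `PIPELINE`, `DATA_DIR`, and `run_pipeline` are hypothetical and not part of the repository; the script names and flags are copied verbatim from the commands above.

```python
import subprocess
import sys
from pathlib import Path

# Dataset location taken from the Data Preparation section.
DATA_DIR = Path("datasets/ILSVRC2012")

# Commands copied from this README, in the order they appear.
PIPELINE = [
    ["save_clip_class_cavs.py"],
    ["save_target_class_cavs.py", "--model", "resnet18",
     "--weights", "ResNet18_Weights.IMAGENET1K_V1", "--test-only"],
    ["train_cav.py", "--model", "resnet18", "--lr", "1e-1", "--epochs", "3",
     "--seed", "1028", "--batch-size", "1024",
     "--output-dir", "output_cosine_cav_1000classes"],
    ["train_target_model.py", "--data-set", "ImageNet", "--model", "resnet18",
     "--lr", "1e-3", "--data-T", "3", "--batch-size", "512", "--epochs", "20",
     "--seed", "1028", "--output-dir", "output_cosine", "--use-target-cav"],
]


def run_pipeline(dry_run: bool = True) -> list[list[str]]:
    """Build the full commands; run them unless dry_run is set.

    Assumption: each step depends on the previous one, so execution
    stops at the first failing command.
    """
    commands = [[sys.executable, *step] for step in PIPELINE]
    if dry_run:
        return commands
    if not DATA_DIR.is_dir():
        raise FileNotFoundError(f"Expected ImageNet at {DATA_DIR}")
    for cmd in commands:
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return commands


if __name__ == "__main__":
    # Print the commands without executing them.
    for cmd in run_pipeline(dry_run=True):
        print(" ".join(cmd))
```

Calling `run_pipeline(dry_run=False)` would execute the steps for real. The dry-run default is a deliberate choice, since the training steps are long-running.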