# Quick Evaluation of CorruptEncoder

## Introduction

This repo provides pre-trained clean and backdoored encoders [[URL](https://drive.google.com/file/d/1Vxijq-ROo3ZAVPPNzFu0uhT8EyEHckb6/view?usp=sharing)] together with linear-evaluation code for measuring the performance of different backdoor attacks on contrastive learning. In particular, we provide a clean encoder and encoders backdoored by **SSL-Backdoor**, **PoisonedEncoder**, and **CorruptEncoder(+)**, targeting two downstream tasks (ImageNet100-A and ImageNet100-B).

Each encoder is pre-trained under the default setting with the same poisoning ratio of **0.5%**. Moreover, CorruptEncoder uses only **3** reference images to achieve an attack success rate (ASR) above 90%.

## Setup environment

Install [PyTorch](https://pytorch.org/). The code was most recently tested with PyTorch 1.7 and Python 3.6, and is also compatible with PyTorch 2.0 and Python 3.10.
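
To confirm which versions are active before running the scripts, a small check like the one below may help. It is only a sketch and not part of this repo: the function name `describe_environment` is hypothetical, and it reports `None` for PyTorch if the package is not installed.

```python
import importlib.util
import sys


def describe_environment():
    """Return the Python version and, if installed, the PyTorch version."""
    info = {
        "python": "{}.{}".format(sys.version_info[0], sys.version_info[1]),
        "torch": None,  # stays None when PyTorch is not installed
        "cuda": None,   # stays None when PyTorch is not installed
    }
    # Only import torch if it can actually be found on this interpreter.
    if importlib.util.find_spec("torch") is not None:
        import torch

        info["torch"] = torch.__version__
        info["cuda"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    print(describe_environment())
```

The printed versions can then be compared against the tested combinations listed above.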

## Usage

1. Generate the file lists for a downstream task (please specify the path to the ImageNet dataset):

        cd get_downstream_dataset

        python3 gen_downstream_filelist.py --downstream_task_name imagenet100_A --downstream_train_ratio 0.1 --save_dir ../data/imagenet100_A

        python3 gen_downstream_filelist.py --downstream_task_name imagenet100_B --downstream_train_ratio 1.0 --save_dir ../data/imagenet100_B


2. Download the pre-trained encoders from this [[URL](https://drive.google.com/file/d/1Vxijq-ROo3ZAVPPNzFu0uhT8EyEHckb6/view?usp=sharing)] and place the `ckpt` folder at the repository root (or use the provided bash file). The expected layout is:

        CorruptEncoder
        └── train_moco
        └── ckpt
            └── ImageNet100-A-CorruptEncoder
            └── ImageNet100-B-CorruptEncoder
            └── ...
        └── ...

3. Train the linear layer for a downstream classifier:


        cd train_moco

        ### Use bash
        bash run_linear.sh

        ### Use command-line
        CUDA_VISIBLE_DEVICES=0 python3 eval_linear.py \
            --arch moco_resnet18 \
            --weights ../ckpt/ImageNet100-A-CorruptEncoder/checkpoint_0199.pth.tar \
            --train_file ../data/imagenet100_A/ds_train.txt \
            --val_file ../data/imagenet100_A/ds_test.txt


4. Evaluate the **CA** (clean accuracy) and **ASR** (attack success rate) of a downstream classifier built on a clean or backdoored encoder:

        ### Use bash
        bash run_test.sh

        ### Use command-line
        CUDA_VISIBLE_DEVICES=0 python3 eval_linear.py \
            --arch moco_resnet18 \
            --evaluate \
            --eval_data exp \
            --load_cache \
            --weights ../ckpt/ImageNet100-A-CorruptEncoder/checkpoint_0199.pth.tar \
            --resume ../ckpt/ImageNet100-A-CorruptEncoder/linear/checkpoint_0199.pth.tar \
            --val_file ../data/imagenet100_A/ds_test.txt \
            --val_poisoned_file ../data/imagenet100_A/ds_poisoned_test.txt \
            --target_cls 20
		

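As a rough illustration of the two metrics evaluated in the last step above, the sketch below computes CA and ASR from lists of predicted labels. It is not the code in `eval_linear.py`: the function names `clean_accuracy` and `attack_success_rate` are hypothetical, and it assumes that ASR is the percentage of trigger-embedded test inputs classified as the target class. Whether the repo excludes inputs whose true class is already the target class is not stated here, so this sketch does not filter them.

```python
def clean_accuracy(predictions, labels):
    """Percentage of clean test inputs whose prediction matches the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return 100.0 * correct / len(labels)


def attack_success_rate(poisoned_predictions, target_cls):
    """Percentage of trigger-embedded inputs predicted as the target class.

    Assumption: every poisoned input counts, regardless of its true class.
    """
    if not poisoned_predictions:
        raise ValueError("at least one sample is required")
    hits = sum(1 for p in poisoned_predictions if p == target_cls)
    return 100.0 * hits / len(poisoned_predictions)


if __name__ == "__main__":
    # Toy predictions, made up purely for illustration.
    clean_preds = [3, 20, 7, 7]
    clean_labels = [3, 20, 7, 1]
    poisoned_preds = [20, 20, 5, 20]

    print("CA :", clean_accuracy(clean_preds, clean_labels))
    print("ASR:", attack_success_rate(poisoned_preds, target_cls=20))
```

Here `target_cls=20` mirrors the `--target_cls 20` flag in the command above. A backdoored encoder is effective when its CA stays close to that of the clean encoder while its ASR is high.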

## Results

The table below lists the expected results for each pre-trained encoder provided in this repo:

| Model | Downstream Task | CA (%) | ASR (%) |
| :-: | :-: | :-: | :-: |
| Clean | ImageNet100-A | 68.8 | 0.3 |
| CorruptEncoder | ImageNet100-A | 69.2 | **97.3** |
| ----- | ----- | ----- | ----- |
| Clean | ImageNet100-B | 61.2 | 0.4 |
| CorruptEncoder | ImageNet100-B | 61.6 | 92.9 |
| CorruptEncoder+ | ImageNet100-B | 61.7 | **99.5** |
| PoisonedEncoder | ImageNet100-B | 61.1 | 35.5 |
| SSL-Backdoor | ImageNet100-B | 61.3 | 14.3 |
