# Revitalizing CNN Attention via Transformers in Self-Supervised Visual Representation Learning

This repository is the official implementation of CARE. 
![Graph](images/pipeline.png)

## Requirements

To install requirements:

```setup
conda create -n care python=3.6
conda activate care
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch
pip install tensorboard
pip install ipdb
pip install einops
pip install loguru
pip install pyarrow==3.0.0
pip install tqdm
```

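Once the installs finish, a quick version check can catch an environment whose PyTorch is too old before a long job is launched. The sketch below is only an illustration: `version_ge` is a hypothetical helper that is not part of this repository, and it assumes a `sort` that supports `-V` (GNU coreutils or a recent BSD `sort`).

```shell
#!/usr/bin/env bash
# Sketch: check that the installed PyTorch meets a minimum version.
# version_ge is a hypothetical helper, not part of this repository.

# Succeeds when version $1 >= version $2 (relies on `sort -V`).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

MIN_TORCH="1.6"

# Ask Python for the installed version; stay quiet if torch is missing.
if torch_version="$(python -c 'import torch; print(torch.__version__)' 2>/dev/null)"; then
    if version_ge "$torch_version" "$MIN_TORCH"; then
        echo "PyTorch $torch_version is new enough (>= $MIN_TORCH)"
    else
        echo "PyTorch $torch_version is older than $MIN_TORCH"
    fi
else
    echo "PyTorch is not importable in the current environment"
fi
```

Run it inside the activated `care` environment so that `python` resolves to the interpreter the packages were installed into.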
>📋  PyTorch>=1.6 is needed for running the code.
## Data Preparation

Prepare the ImageNet data as `{data_path}/train.lmdb` and `{data_path}/val.lmdb`.

Replace the original data path in `care/data/dataset_lmdb` (lines 7 and 40) with your own `{data_path}`.

>📋  Note that we use LMDB files to speed up data processing.

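Before editing the dataset path, it can help to confirm that both files are where the code will look for them. This is a minimal sketch: `check_lmdb_data` and the `DATA_PATH` default are hypothetical, and only the two names `train.lmdb` and `val.lmdb` come from the instructions above.

```shell
# Sketch: confirm both LMDB files exist under a given data path.
# check_lmdb_data is a hypothetical helper, not part of this repository.
check_lmdb_data() {
    data_path="$1"
    rc=0
    for split in train val; do
        if [ -e "$data_path/$split.lmdb" ]; then
            echo "found   $data_path/$split.lmdb"
        else
            echo "missing $data_path/$split.lmdb" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example call; DATA_PATH is a placeholder for your own {data_path}.
DATA_PATH="${DATA_PATH:-/path/to/imagenet_lmdb}"
check_lmdb_data "$DATA_PATH" || echo "fix the data path before training"
```

The helper returns a non-zero status when either file is missing, so it can also gate a launch script.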
## Training

Before training the ResNet-50 (100 epochs) model from the paper, first add the code directories to your PYTHONPATH:

```train
export PYTHONPATH=$PYTHONPATH:{your_code_path}/care_5trans/
export PYTHONPATH=$PYTHONPATH:{your_code_path}/care_5trans/care/
```

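If the exports above are sourced more than once, or `PYTHONPATH` starts out empty, the variable can pick up duplicate entries or a stray leading colon. The following variant is a sketch that avoids both; `CODE_PATH` is an assumed stand-in for `{your_code_path}` and its default value is a placeholder.

```shell
# Sketch: add both code directories to PYTHONPATH exactly once.
# CODE_PATH stands in for {your_code_path}; the default is a placeholder.
CODE_PATH="${CODE_PATH:-/path/to/code}"

for dir in "$CODE_PATH/care_5trans" "$CODE_PATH/care_5trans/care"; do
    case ":${PYTHONPATH:-}:" in
        *":$dir:"*) ;;  # already present, skip to avoid duplicates
        *) PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}$dir" ;;
    esac
done
export PYTHONPATH
echo "$PYTHONPATH"
```

The same lines apply unchanged before evaluation, since both steps need the same two directories.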
Then run the training code via:

```train
bash run_train.sh      # train CARE with 8 GPUs
bash debug_train.sh    # train CARE with a single GPU
```

>📋  The training script runs unsupervised pre-training of a ResNet-50 model on ImageNet on an 8-GPU machine.
>1. Use `-b` to specify the batch size, e.g., `-b 128`.
>2. Use `-d` to specify the GPU ids for training, e.g., `-d 0-7`.
>3. Use `--log_path` to specify the main folder for saving experimental results.
>4. Use `--experiment-name` to specify the folder for saving training outputs.
>
>The code base also supports training other backbones (e.g., ResNet101 and ResNet152) with different training schedules (e.g., 200, 400 and 800 epochs).
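
The four options listed above can be kept together as variables so that a run is easy to reproduce. This sketch only assembles and prints the flags; how `run_train.sh` passes them to the training entry point is not shown here, and the log folder and experiment name are made-up examples.

```shell
# Sketch: collect the documented flags in one place before launching.
# Values are examples; this only prints the flags and launches nothing.
BATCH_SIZE=128                      # -b
GPU_IDS="0-7"                       # -d
LOG_PATH="./logs"                   # --log_path (hypothetical location)
EXPERIMENT_NAME="care_r50_100ep"    # --experiment-name (hypothetical name)

TRAIN_FLAGS="-b $BATCH_SIZE -d $GPU_IDS --log_path $LOG_PATH --experiment-name $EXPERIMENT_NAME"
echo "$TRAIN_FLAGS"
```

With the values above, the printed line is `-b 128 -d 0-7 --log_path ./logs --experiment-name care_r50_100ep`.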
## Evaluation
Before starting the evaluation, first add the code directories to your PYTHONPATH:

```eval
export PYTHONPATH=$PYTHONPATH:{your_code_path}/care_5trans/
export PYTHONPATH=$PYTHONPATH:{your_code_path}/care_5trans/care/
```

Then, to evaluate the pre-trained model (e.g., ResNet50-100epoch) on ImageNet, run:

```eval
bash run_val.sh      # evaluate CARE with 8 GPUs
bash debug_val.sh    # evaluate CARE with a single GPU
```

>📋  The evaluation script runs the supervised linear evaluation of a ResNet-50 model on ImageNet on an 8-GPU machine.
>1. Use `-b` to specify the batch size, e.g., `-b 128`.
>2. Use `-d` to specify the GPU ids, e.g., `-d 0-7`.
>3. Modify `--log_path` according to your own config.
>4. Modify `--experiment-name` according to your own config.

## Pre-trained Models

We provide the following pre-trained models:

- [ResNet-50 100epoch](https://drive.google.com/file/d/193MHYAcb1WlGaR0RA2MOBRFqViXr_BQx/view?usp=sharing): ResNet-50 pre-trained on ImageNet for 100 epochs.
- [ResNet-50 200epoch](https://drive.google.com/file/d/1tSjRpts6WvtFfgGq3npfxlIUdLkNUSvx/view?usp=sharing): ResNet-50 pre-trained on ImageNet for 200 epochs.
- [ResNet-50 400epoch](https://drive.google.com/file/d/15M8fEssCyiD9IMmWAEf1ytDVQCVa1Lwe/view?usp=sharing): ResNet-50 pre-trained on ImageNet for 400 epochs.


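The checkpoints are hosted on Google Drive, so fetching them on a headless machine usually starts from the file ID embedded in the share link. The sketch below extracts that ID; `drive_file_id` is a hypothetical helper that is not part of this repository, and the URL is the ResNet-50 100-epoch link from the list above.

```shell
# Sketch: pull the file ID out of a Google Drive share link.
# drive_file_id is a hypothetical helper, not part of this repository.
drive_file_id() {
    # Keep the path segment between "/file/d/" and the next "/".
    printf '%s\n' "$1" | sed -n 's#.*/file/d/\([^/]*\)/.*#\1#p'
}

URL="https://drive.google.com/file/d/193MHYAcb1WlGaR0RA2MOBRFqViXr_BQx/view?usp=sharing"
drive_file_id "$URL"
```

The printed ID can then be handed to whichever downloader you prefer; command-line tools such as `gdown` work from a Drive file ID, though no such tool is listed in this repository's requirements.
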
>📋  We will provide more pretrained models in the future.
## Results

Our model achieves the following performance on ImageNet classification and on downstream detection and segmentation tasks:

### Self-supervised learning on image classification

| Method   | Backbone  | epoch |Top 1 Accuracy  | Top 5 Accuracy | pretrained model  | linear evaluation model |
| ---------| --------- | ------|--------------- | -------------- | ----------------- | ----------------------- |
| CARE     | ResNet50  | 100   |    72.02%      |      90.02%    | [pretrained](https://drive.google.com/file/d/193MHYAcb1WlGaR0RA2MOBRFqViXr_BQx/view?usp=sharing) | [linear_model](https://drive.google.com/file/d/19XhIeg_bZLqWQ3KL0wsXiX6DN00dWapf/view?usp=sharing) |
| CARE     | ResNet50  | 200   |    73.78%      |      91.50%    | [pretrained](https://drive.google.com/file/d/1tSjRpts6WvtFfgGq3npfxlIUdLkNUSvx/view?usp=sharing) | [linear_model](https://drive.google.com/file/d/1NBrEsMkM0fWOUS1TRTvT4rmb6jodOOON/view?usp=sharing) |
| CARE     | ResNet50  | 400   |    74.68%      |      91.97%    | [pretrained](https://drive.google.com/file/d/15M8fEssCyiD9IMmWAEf1ytDVQCVa1Lwe/view?usp=sharing) | [linear_model](https://drive.google.com/file/d/1IukecCzobtey4ezdRdwa5v2YsR5WqMW_/view?usp=sharing) |

### Transfer learning to object detection and semantic segmentation
#### COCO det
| Method   | Backbone  | epoch | AP_bb | AP_bb_50 | AP_bb_75 | pretrained model  | det/seg model |
| ---------| --------- | ------|------ | -------- | -------- | ----------------- | ----------------------- |
| CARE     | ResNet50  | 200   | 39.4  |  59.2    |  42.6    | [pretrained](https://drive.google.com/file/d/1tSjRpts6WvtFfgGq3npfxlIUdLkNUSvx/view?usp=sharing) | [model] |
| CARE     | ResNet50  | 400   | 39.6  |  59.4    |  42.9    | [pretrained](https://drive.google.com/file/d/15M8fEssCyiD9IMmWAEf1ytDVQCVa1Lwe/view?usp=sharing) | [model] |
| CARE | ResNet50-FPN  | 200   | 39.5  |  60.2    |  43.1    | [pretrained](https://drive.google.com/file/d/1tSjRpts6WvtFfgGq3npfxlIUdLkNUSvx/view?usp=sharing) | [model] |
| CARE | ResNet50-FPN  | 400   | 39.8  |  60.5    |  43.5    | [pretrained](https://drive.google.com/file/d/15M8fEssCyiD9IMmWAEf1ytDVQCVa1Lwe/view?usp=sharing) | [model] |

#### COCO instance seg
| Method   | Backbone  | epoch | AP_mk | AP_mk_50 | AP_mk_75 | pretrained model  | det/seg model |
| ---------| --------- | ------|------ | -------- | -------- | ----------------- | ----------------------- |
| CARE     | ResNet50  | 200   | 34.6  |  56.1    |  36.8    | [pretrained](https://drive.google.com/file/d/1tSjRpts6WvtFfgGq3npfxlIUdLkNUSvx/view?usp=sharing) | [model] |
| CARE     | ResNet50  | 400   | 34.7  |  56.1    |  36.9    | [pretrained](https://drive.google.com/file/d/15M8fEssCyiD9IMmWAEf1ytDVQCVa1Lwe/view?usp=sharing) | [model] |
| CARE | ResNet50-FPN  | 200   | 35.9  |  57.2    |  38.5    | [pretrained](https://drive.google.com/file/d/1tSjRpts6WvtFfgGq3npfxlIUdLkNUSvx/view?usp=sharing) | [model] |
| CARE | ResNet50-FPN  | 400   | 36.2  |  57.4    |  38.8    | [pretrained](https://drive.google.com/file/d/15M8fEssCyiD9IMmWAEf1ytDVQCVa1Lwe/view?usp=sharing) | [model] |

#### VOC07+12 det
| Method   | Backbone  | epoch | AP    | AP_50    | AP_75    | pretrained model  | det/seg model |
| ---------| --------- | ------|------ | -------- | -------- | ----------------- | ----------------------- |
| CARE     | ResNet50  | 200   | 57.7  |  83.0    |  64.5    | [pretrained](https://drive.google.com/file/d/1tSjRpts6WvtFfgGq3npfxlIUdLkNUSvx/view?usp=sharing) | [model] |
| CARE     | ResNet50  | 400   | 57.9  |  83.0    |  64.7    | [pretrained](https://drive.google.com/file/d/15M8fEssCyiD9IMmWAEf1ytDVQCVa1Lwe/view?usp=sharing) | [model] |

>📋  More results are provided in the paper.
## Contributing

>📋  WIP
