<br>

# A distinct unsupervised reference model from the environment helps continual learning (ICLR 2023)

This repository provides a PyTorch implementation of URSL, a novel model for open-set semi-supervised continual learning. The directory outline is as follows:

```bash
root
 ├── dataset        # The folder containing all PyTorch code for the datasets and dataloaders
    ├── dataloaders.py      # all functions that provide data to the model
    ├── datasets.py     # configurations for all datasets
    ├── load_tiny_imagenet.py       # functions to load train and test data for Tiny ImageNet dataset

 ├── model      # The folder containing all URSL models: the student and teacher networks and the OoD detection module
    ├── architecture        # The folder containing the network architectures for the student and teacher
        ├── ResNet.py       # ResNet networks
    ├── ood_detection.py        # The OoD detection module of URSL
    ├── URSL.py     # The main module of URSL model

 ├── losses.py      # All losses used in the URSL model
 ├── main.py        # The main script to run the URSL model
 ├── test.py        # Trains a classifier on the learned embedding to classify between classes
 ├── utils.py       # Utility functions used in the training procedures
```
The following sections first describe how to set up the datasets, then give instructions for installing package dependencies, training, and testing.

# Configuring the Dataset
In this paper, we use the [CIFAR10 and CIFAR100](https://www.cs.toronto.edu/~kriz/cifar.html) and [Tiny-ImageNet](https://www.kaggle.com/c/tiny-imagenet) datasets. CIFAR10 and CIFAR100 are lightweight datasets and are automatically downloaded into `data/cifar10/` and `data/cifar100/`, respectively, whenever needed. The Tiny-ImageNet dataset, however, must be downloaded manually and placed in `data/tiny-imagenet/`. Since its test images are not labeled, we use the validation set as the test set to evaluate the models. The following directory outline shows how to place this dataset properly:

```bash
root
 ├── data       # The folder containing all the data
    ├── train       # train part of tiny-imagenet dataset
        ├── n01443537       # class annotation name 
            ├── images      # The folder containing all images that belong to the parent folder's class
                ├── n01443537_0.JPEG 
                ├── n01443537_1.JPEG 
                ├── .....
        ├── ....
    ├── val     # validation part of tiny-imagenet dataset
        ├── images      # The folder containing all images of tiny-imagenet validation part
            ├── val_0.JPEG
        ├── val_annotations.txt     # annotation for validation part of tiny-imagenet dataset
```
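
As a rough illustration of how the validation layout above can be read, the sketch below pairs each image in `val/images` with its class using `val_annotations.txt`. It assumes the standard Tiny-ImageNet annotation format (tab-separated rows with the file name first and the class id second); the function names `read_val_annotations` and `list_val_samples` are hypothetical and are not taken from `load_tiny_imagenet.py`.

```python
import csv
from pathlib import Path


def read_val_annotations(annotation_file):
    """Map each validation image file name to its class id.

    Assumes one tab-separated row per image: file name in column 0, class id
    in column 1. Remaining columns (bounding box) are ignored.
    """
    labels = {}
    with open(annotation_file, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 2:  # skip blank or malformed lines
                labels[row[0]] = row[1]
    return labels


def list_val_samples(val_dir):
    """Return sorted (image_path, class_id) pairs for a val/ folder laid out as above."""
    val_dir = Path(val_dir)
    labels = read_val_annotations(val_dir / "val_annotations.txt")
    return sorted((val_dir / "images" / name, wnid) for name, wnid in labels.items())
```

Calling `list_val_samples` on the `val` folder would give a list that can be wrapped in a dataset class; how the repository itself does this may differ.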

# Installing Prerequisites

The following packages are required:

- torch==1.11.0+cu113
- torchvision==0.12.0+cu113
- tensorboard==2.9.0
- numpy==1.22.3
- Pillow==9.1.0
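
To confirm that an environment matches the versions listed above, a small check like the following may help. It is only a sketch: it assumes Python 3.8+ (for `importlib.metadata`), and the helper names `base_version` and `check_requirements` are hypothetical. Build tags such as `+cu113` are ignored in the comparison, so a CPU build of the same release is reported as `ok`.

```python
from importlib import metadata

# Pinned versions copied from the list above.
REQUIRED = {
    "torch": "1.11.0+cu113",
    "torchvision": "0.12.0+cu113",
    "tensorboard": "2.9.0",
    "numpy": "1.22.3",
    "Pillow": "9.1.0",
}


def base_version(version):
    """Drop a local build tag such as '+cu113' before comparing versions."""
    return version.split("+", 1)[0]


def check_requirements(required=REQUIRED):
    """Return {package: status}; status is 'ok', 'missing', or 'found <version>'."""
    report = {}
    for package, wanted in required.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "missing"
            continue
        same = base_version(installed) == base_version(wanted)
        report[package] = "ok" if same else f"found {installed}"
    return report


if __name__ == "__main__":
    for package, status in check_requirements().items():
        print(f"{package}: {status}")
```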

# Training and Testing

URSL training consists of two phases. In the first phase, you run `main.py`, which saves the encoders of all tasks to the defined directory in the `experiments` folder. In the second phase, which is combined with the test phase, you run `test.py`, which trains a classifier on top of the last trained encoder and then reports the accuracy of the model.

## The first phase of training

Consider the following scenario:
The CIFAR100 dataset is the main dataset, where only 5% of its data are labeled (25 of the 500 training images per class, hence `--num-labeled-per-class 25`), and the CIFAR10 dataset is the peripheral dataset with a memory size of 500.
To run the first phase of training for this scenario, use the following command:

```bash
python main.py --experiment-name cifar100-5%-cifar10-mem500 --labeled-dataset-name cifar100 --unlabeled-dataset-name cifar10 --num-labeled-per-class 25 --class-per-task 10 --memory-size 500
```

For other settings, simply change the arguments above.
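
When adapting the command to another labeled fraction or task split, the argument values can be derived with a little arithmetic. The sketch below is illustrative only: the helper names are hypothetical, the per-class counts are those of the standard CIFAR training splits, and it assumes that classes are divided evenly across tasks, which may not be a requirement of the actual code.

```python
# Training images per class and class counts for the standard CIFAR train splits.
TRAIN_IMAGES_PER_CLASS = {"cifar10": 5000, "cifar100": 500}
NUM_CLASSES = {"cifar10": 10, "cifar100": 100}


def labeled_per_class(dataset, labeled_percent):
    """Value for --num-labeled-per-class given a labeled percentage."""
    return round(TRAIN_IMAGES_PER_CLASS[dataset] * labeled_percent / 100)


def num_tasks(dataset, class_per_task):
    """Number of tasks implied by --class-per-task (assumes an even split)."""
    classes = NUM_CLASSES[dataset]
    if classes % class_per_task != 0:
        raise ValueError("class_per_task must divide the number of classes")
    return classes // class_per_task


# The scenario above: 5% labeled CIFAR100 with 10 classes per task.
print(labeled_per_class("cifar100", 5))  # 25 labeled images per class
print(num_tasks("cifar100", 10))         # 10 tasks
```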

<br>

## The second phase of training and the test phase

To train a classifier on top of the trained encoders and evaluate it, run the following command:

```bash
python test.py --experiment-name cifar100-5%-cifar10-mem500 --labeled-dataset-name cifar100  --class-per-task 10 --student-arch resnet18
```

Note that the `--experiment-name` argument in the second phase must be the same as in the first phase so that the code can load the trained encoders.
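
One way to keep the experiment name consistent across both phases is to generate the two commands from a single set of values. The following is a minimal sketch using only the flags shown in the commands above; `build_phase_commands` is a hypothetical helper, not part of this repository.

```python
import shlex


def build_phase_commands(experiment_name, labeled_dataset, unlabeled_dataset,
                         num_labeled_per_class, class_per_task, memory_size,
                         student_arch):
    """Return (train_cmd, test_cmd) argument lists that share one experiment name."""
    train_cmd = [
        "python", "main.py",
        "--experiment-name", experiment_name,
        "--labeled-dataset-name", labeled_dataset,
        "--unlabeled-dataset-name", unlabeled_dataset,
        "--num-labeled-per-class", str(num_labeled_per_class),
        "--class-per-task", str(class_per_task),
        "--memory-size", str(memory_size),
    ]
    test_cmd = [
        "python", "test.py",
        "--experiment-name", experiment_name,  # must match the first phase
        "--labeled-dataset-name", labeled_dataset,
        "--class-per-task", str(class_per_task),
        "--student-arch", student_arch,
    ]
    return train_cmd, test_cmd


train_cmd, test_cmd = build_phase_commands(
    "cifar100-5%-cifar10-mem500", "cifar100", "cifar10",
    num_labeled_per_class=25, class_per_task=10, memory_size=500,
    student_arch="resnet18",
)
print(shlex.join(train_cmd))
print(shlex.join(test_cmd))
```

To actually launch the phases, each list could be passed to `subprocess.run(cmd, check=True)` from the repository root, first phase before second.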

# References

* [Co<sup>2</sup>L](https://github.com/chaht01/Co2L)
* [CIFAR10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset
* [CIFAR100](https://www.cs.toronto.edu/~kriz/cifar.html) dataset
* [Tiny-ImageNet](https://github.com/DennisHanyuanXu/Tiny-ImageNet) dataset under [MIT License](https://github.com/DennisHanyuanXu/Tiny-ImageNet/blob/master/LICENSE)
