## One-shot Active Learning based on Lewis Weight Sampling for Multiple Deep Models

Authors: Anonymous

Implementation of the paper "One-shot Active Learning based on Lewis Weight Sampling for Multiple Deep Models". This project is based on the [DIAM](https://github.com/tangypnuaa/DIAM) project. The method uses the maximum Lewis weight across multiple deep models to sample and reweight the unlabeled instances. The repository includes the code for training and testing models, and for active data selection with all compared methods. (**This work is under double-blind review; please do not distribute.**)
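
The sampling idea can be sketched in NumPy. The snippet below is a minimal illustration, not the code in this repository. It assumes each model exposes an `n x d` feature matrix for the unlabeled pool, approximates the Lewis weights with the usual fixed-point iteration, takes the per-instance maximum across models, and draws instances with importance weights `1 / (budget * prob)`. The function names, the default `p=1`, the iteration count, and the exact reweighting rule are illustrative assumptions.

```python
import numpy as np


def lewis_weights(A, p=1.0, n_iter=30, eps=1e-12):
    """Approximate l_p Lewis weights of the rows of A by fixed-point iteration.

    The iteration w_i <- (a_i^T (A^T W^(1-2/p) A)^+ a_i)^(p/2) is known to
    converge for p < 4. For p = 2 it returns the statistical leverage scores.
    """
    n, d = A.shape
    w = np.full(n, d / n)  # uniform starting point
    for _ in range(n_iter):
        scale = np.maximum(w, eps) ** (1.0 - 2.0 / p)
        M = A.T @ (A * scale[:, None])            # A^T W^(1-2/p) A
        M_inv = np.linalg.pinv(M)
        tau = np.einsum("ij,jk,ik->i", A, M_inv, A)  # a_i^T M^+ a_i per row
        w = np.maximum(tau, 0.0) ** (p / 2.0)
    return w


def sample_by_max_lewis_weight(features, budget, p=1.0, seed=0):
    """Sample `budget` indices with probability proportional to the maximum
    Lewis weight over all models, and return importance weights.

    `features` is a list of (n, d_k) arrays, one per model (an assumption
    about how model representations are provided).
    """
    rng = np.random.default_rng(seed)
    w_max = np.max([lewis_weights(F, p) for F in features], axis=0)
    probs = w_max / w_max.sum()
    idx = rng.choice(len(probs), size=budget, replace=True, p=probs)
    weights = 1.0 / (budget * probs[idx])  # importance-sampling reweighting
    return idx, weights
```

Sampling is done with replacement here so that the importance weights keep the reweighted loss unbiased; a different sampling scheme would need a matching reweighting rule.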

## Main requirements

- horovod
- tqdm
- alipy
- torch
- torchvision
- pillow
- scikit-learn
- filelock
- torchmetrics
- fastai

## Computational Resources

We ran our experiments on 2 cloud servers, each with 128GB of memory and 4 RTX 3090 graphics cards. The CPU is an Intel Xeon Gold 5317. Since each compared method runs on its own graphics card, we report the resource usage of each individual process. Training and validating a model requires at least 16GB of memory and 7GB of CUDA memory with a training batch size of 64. Running the coreset query method needs an extra 10GB of memory to store the distance matrix.
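
To get a feel for where the coreset memory cost comes from, the following sketch estimates the size of a dense pairwise distance matrix. The pool size of 50,000 and the `float32` entry size are hypothetical values chosen for illustration; the actual figure depends on the dataset and on how the matrix is stored.

```python
def pairwise_matrix_gib(n_instances, bytes_per_entry=4):
    """Memory in GiB of a dense n x n matrix (float32 by default)."""
    return n_instances ** 2 * bytes_per_entry / 1024 ** 3


# Hypothetical pool of 50,000 unlabeled instances, float32 entries.
print(f"{pairwise_matrix_gib(50_000):.2f} GiB")
```

Because the cost grows quadratically with the pool size, doubling the number of unlabeled instances roughly quadruples the memory needed.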

## Pre-trained models

We employ 50 distinct network architectures from a recent NAS work, [once-for-all](https://github.com/mit-han-lab/once-for-all/), as our target models. These architectures were released to accommodate diverse resource-constrained devices, ranging from the NVIDIA Tesla V100 GPU to mobile devices, which aligns well with our problem setting. The model specifications can be found in `load_test_models.py`. Running the code will automatically download and apply the pre-trained weights provided by [once-for-all](https://github.com/mit-han-lab/once-for-all/).

## Usage

##### Scripts to Run the Experiments

To run our method on classification datasets:
```
# usage: bash iter_our.sh [gpu_id] [number_of_target_models] [query_budget]
bash iter_our.sh 0 50 12000
```

To run the other compared methods on classification datasets:
```
# usage: bash iter_cla.sh [gpu_id] [number_of_target_models]
bash iter_cla.sh 0 50
```

To run all methods on regression datasets:
```
# usage: bash iter_reg.sh [gpu_id] [number_of_target_models]
bash iter_reg.sh 0 50
```

##### Train Model

```
# usage: python train_net.py --net_id [0, ..., 50] --dataset [mnist, ..., flw] --al_iter [0, ..., 10] --method [entropy, ..., qbc]
python train_net.py --net_id 0 --dataset mnist --al_iter 0 --method entropy
```

##### Active Query

```
# usage: python al_select.py --method [$METHOD] --dataset [$DATASET] --al_iter [int: 0, 1, 2, ..., 10] --batch_size [active query batch size] --model_num [number_of_target_models]
python al_select.py --method entropy --dataset mnist --al_iter 0 --batch_size 3000 --model_num 50
```

##### Test Models

```
# usage: python test_multi_models.py --dataset [$DATASET] --method [$METHOD] --al_iter [$ITER] --model_num [number_of_target_models]
python test_multi_models.py --dataset mnist --method entropy --al_iter 0 --model_num 50
```
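
The three commands above can be chained for one active learning iteration. The helper below is a hedged sketch that only builds the command lines from the flags documented in this section; it does not run them. The function name `iteration_commands`, the train, test, select ordering, and the assumption that `--net_id` runs from `0` to `model_num - 1` are illustrative and may differ from what the provided shell scripts do.

```python
import shlex


def iteration_commands(dataset, method, al_iter, model_num=50, batch_size=3000):
    """Build the commands for one iteration: train each model, test, then query."""
    common = ["--dataset", dataset, "--al_iter", str(al_iter), "--method", method]
    # One training run per target model (net_id range is an assumption).
    train = [
        ["python", "train_net.py", "--net_id", str(i)] + common
        for i in range(model_num)
    ]
    test = ["python", "test_multi_models.py"] + common + ["--model_num", str(model_num)]
    select = (
        ["python", "al_select.py"]
        + common
        + ["--batch_size", str(batch_size), "--model_num", str(model_num)]
    )
    return train + [test, select]


if __name__ == "__main__":
    for cmd in iteration_commands("mnist", "entropy", al_iter=0, model_num=2):
        print(shlex.join(cmd))  # pass cmd to subprocess.run(cmd, check=True) to execute
```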


## Running time

In our code, the 50 deep models are trained sequentially for 20 epochs each. This takes about 5 hours on one RTX 3090 graphics card. The active learning process is relatively fast. Entropy and QBC take about 30-60 minutes for one iteration of data selection, depending on the size of the unlabeled set. Coreset takes about 10-30 minutes. DIAM is about 10 times more expensive than Entropy, as it needs many more models to predict the unlabeled instances. Our method requires 1-4 hours for the one-shot query, as calculating the Lewis weights is expensive. However, our method trains the models only once, whereas the iterative methods retrain them in every iteration, so our total running time is significantly lower than that of the other methods.
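
A back-of-the-envelope comparison makes the difference concrete. The sketch below plugs in the per-step timings quoted above; the number of iterations (10) is a hypothetical value for illustration, and the function names are not part of this repository.

```python
def iterative_hours(n_iters, train_h=5.0, select_h=(0.5, 1.0)):
    """(low, high) total hours when models are retrained in every iteration."""
    return tuple(n_iters * (train_h + s) for s in select_h)


def one_shot_hours(train_h=5.0, select_h=(1.0, 4.0)):
    """(low, high) total hours for a single query followed by one training run."""
    return tuple(train_h + s for s in select_h)


# Entropy/QBC-style selection (30-60 minutes) over a hypothetical 10 iterations.
print("iterative:", iterative_hours(10))
print("one-shot: ", one_shot_hours())
```

Under these assumptions the iterative baseline is dominated by repeated training, which is why a more expensive but single query can still be cheaper overall.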
