# Deep Positive-Unlabeled Anomaly Detection for Contaminated Unlabeled Data
This is a PyTorch implementation of the submitted paper.


## Prerequisites
- Install the dependencies listed in `requirements.txt`.


## Datasets
- All datasets except MVTec will be downloaded when first used.
- Please download the MVTec dataset from the following URL, extract it, and place it under `datasets/mvtec_anomaly_detection`:
  - https://www.mvtec.com/company/research/datasets/mvtec-ad
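Before launching an MVTec run, it can help to confirm that the archive was extracted to the expected location. The following is a minimal sketch, not part of the repository: the helper name `mvtec_categories` is hypothetical, and it only assumes that the extracted dataset contains one subfolder per category under `datasets/mvtec_anomaly_detection`.

```python
from pathlib import Path

# Path taken from the instructions above, relative to the repository root.
MVTEC_ROOT = Path("datasets/mvtec_anomaly_detection")


def mvtec_categories(root: Path = MVTEC_ROOT) -> list:
    """Return the sorted names of the subdirectories found under `root`.

    Hypothetical helper: an empty list means the directory is missing
    or contains no subfolders.
    """
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


if __name__ == "__main__":
    found = mvtec_categories()
    if found:
        print(f"Found {len(found)} folders under {MVTEC_ROOT}: {found}")
    else:
        print(f"Nothing found under {MVTEC_ROOT}; download and extract MVTec AD first.")
```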


## Usage

### for MNIST, FashionMNIST, SVHN and CIFAR10
```
usage: main-cnn.py [-h] [--dataset DATASET] [--normal_class NORMAL_CLASS]
                   [--unseen_anomaly UNSEEN_ANOMALY] [--algorithm ALGORITHM]
                   [--alpha ALPHA] [--n_epoch N_EPOCH]
                   [--learning_rate LEARNING_RATE] [--batch_size BATCH_SIZE]
                   [--seed SEED]
```
- You can choose the `dataset` from the following datasets:
  - `MNIST`: handwritten digits
  - `FashionMNIST`: fashion product images
  - `SVHN`: house number digits
  - `CIFAR10`: animal and vehicle images
- You can choose the `normal_class` from 0 to 9
- You can choose the `unseen_anomaly` from 0 to 9
- You can choose the `algorithm` from the following algorithms:
  - `IF`: Isolation Forest
  - `AE`: Autoencoder
  - `DeepSVDD`: DeepSVDD
  - `LOE`: Latent Outlier Exposure
  - `ABC`: Autoencoding Binary Classifier
  - `DeepSAD`: Deep Semi-Supervised Anomaly Detection
  - `SOEL`: Semi-supervised Outlier Exposure with a Limited labeling budget
  - `PU`: PU Learning Classifier
  - `PUAE`: Our approach with AE
  - `PUSVDD`: Our approach with DeepSVDD
- You can change `alpha`, the hyperparameter of the PU learning-based approaches
- You can also set the random `seed` and the training hyperparameters `n_epoch`, `learning_rate`, and `batch_size`
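To make the options above concrete, here is a hedged sketch of an `argparse` parser that accepts the same flags as the usage string. It is an illustration only and may differ from the parser in `main-cnn.py`: the argument types, the use of `choices`, and the absence of defaults are assumptions, and `build_parser` is a hypothetical name.

```python
import argparse

# Values copied from the lists above.
DATASETS = ["MNIST", "FashionMNIST", "SVHN", "CIFAR10"]
ALGORITHMS = ["IF", "AE", "DeepSVDD", "LOE", "ABC",
              "DeepSAD", "SOEL", "PU", "PUAE", "PUSVDD"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (types are assumed)."""
    parser = argparse.ArgumentParser(prog="main-cnn.py")
    parser.add_argument("--dataset", choices=DATASETS)
    parser.add_argument("--normal_class", type=int, choices=range(10))
    parser.add_argument("--unseen_anomaly", type=int, choices=range(10))
    parser.add_argument("--algorithm", choices=ALGORITHMS)
    # No defaults are set here because the README does not state them.
    parser.add_argument("--alpha", type=float)
    parser.add_argument("--n_epoch", type=int)
    parser.add_argument("--learning_rate", type=float)
    parser.add_argument("--batch_size", type=int)
    parser.add_argument("--seed", type=int)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```

With this sketch, an unsupported value such as `--dataset ImageNet` is rejected by `argparse` before any training starts.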


### for CIFAR100
```
usage: main-cifar100.py [-h] [--algorithm ALGORITHM] [--alpha ALPHA]
                        [--n_epoch N_EPOCH] [--learning_rate LEARNING_RATE]
                        [--batch_size BATCH_SIZE] [--seed SEED]
```
- You can choose the `algorithm` from the following algorithms:
  - `IF`: Isolation Forest
  - `AE`: Autoencoder
  - `DeepSVDD`: DeepSVDD
  - `LOE`: Latent Outlier Exposure
  - `ABC`: Autoencoding Binary Classifier
  - `DeepSAD`: Deep Semi-Supervised Anomaly Detection
  - `SOEL`: Semi-supervised Outlier Exposure with a Limited labeling budget
  - `PU`: PU Learning Classifier
  - `PUAE`: Our approach with AE
  - `PUSVDD`: Our approach with DeepSVDD
- You can change `alpha`, the hyperparameter of the PU learning-based approaches
- You can also set the random `seed` and the training hyperparameters `n_epoch`, `learning_rate`, and `batch_size`


### for medical images
```
usage: main-medical.py [-h] [--dataset DATASET] [--normal_class NORMAL_CLASS]
                       [--unseen_anomaly UNSEEN_ANOMALY]
                       [--algorithm ALGORITHM] [--alpha ALPHA]
                       [--n_epoch N_EPOCH] [--learning_rate LEARNING_RATE]
                       [--batch_size BATCH_SIZE] [--seed SEED]
```
- You can choose the `dataset` from the following datasets:
  - `Path`: colorectal cancer histology dataset with 9 tissue types
  - `OCT`: retinal optical coherence tomography dataset with 4 diagnostic categories
  - `Tissue`: kidney cortex cell dataset with 8 categories
- You can choose the `normal_class` from:
  - `Path`: `[0, 1, 2, 3, 4, 5, 6, 7, 8]`
  - `OCT`: `[3]`
  - `Tissue`: `[0, 1, 2, 3, 4, 5, 6, 7]`
- You can choose the `unseen_anomaly` from:
  - `Path`: `[0, 1, 2, 3, 4, 5, 6, 7, 8]`
  - `OCT`: `[0, 1, 2]`
  - `Tissue`: `[0, 1, 2, 3, 4, 5, 6, 7]`
- You can choose the `algorithm` from the following algorithms:
  - `IF`: Isolation Forest
  - `AE`: Autoencoder
  - `DeepSVDD`: DeepSVDD
  - `LOE`: Latent Outlier Exposure
  - `ABC`: Autoencoding Binary Classifier
  - `DeepSAD`: Deep Semi-Supervised Anomaly Detection
  - `SOEL`: Semi-supervised Outlier Exposure with a Limited labeling budget
  - `PU`: PU Learning Classifier
  - `PUAE`: Our approach with AE
  - `PUSVDD`: Our approach with DeepSVDD
- You can change `alpha`, the hyperparameter of the PU learning-based approaches
- You can also set the random `seed` and the training hyperparameters `n_epoch`, `learning_rate`, and `batch_size`
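Because the valid `normal_class` and `unseen_anomaly` values depend on the medical dataset, a small pre-flight check can catch an invalid combination early. The sketch below encodes only the lists given above. The function name `check_medical_args` is hypothetical, and the script itself may validate its arguments differently or not at all.

```python
# Allowed values per dataset, copied from the lists above.
NORMAL_CLASSES = {
    "Path": list(range(9)),
    "OCT": [3],
    "Tissue": list(range(8)),
}
UNSEEN_ANOMALIES = {
    "Path": list(range(9)),
    "OCT": [0, 1, 2],
    "Tissue": list(range(8)),
}


def check_medical_args(dataset: str, normal_class: int, unseen_anomaly: int) -> None:
    """Raise ValueError if the combination is not listed in the README."""
    if dataset not in NORMAL_CLASSES:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if normal_class not in NORMAL_CLASSES[dataset]:
        raise ValueError(
            f"normal_class {normal_class} is not listed for {dataset}: "
            f"{NORMAL_CLASSES[dataset]}"
        )
    if unseen_anomaly not in UNSEEN_ANOMALIES[dataset]:
        raise ValueError(
            f"unseen_anomaly {unseen_anomaly} is not listed for {dataset}: "
            f"{UNSEEN_ANOMALIES[dataset]}"
        )


if __name__ == "__main__":
    check_medical_args("OCT", normal_class=3, unseen_anomaly=0)  # passes silently
```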


### for MVTec
```
usage: main-mvtec.py [-h] [--algorithm ALGORITHM] [--alpha ALPHA]
                     [--n_epoch N_EPOCH] [--learning_rate LEARNING_RATE]
                     [--batch_size BATCH_SIZE] [--seed SEED]
```
- You can choose the `algorithm` from the following algorithms:
  - `SOEL`: Semi-supervised Outlier Exposure with a Limited labeling budget
  - `PUSVDD`: Our approach with DeepSVDD
- You can change `alpha`, the hyperparameter of the PU learning-based approaches
- You can also set the random `seed` and the training hyperparameters `n_epoch`, `learning_rate`, and `batch_size`


### for toy dataset
```
usage: toy-deep.py [-h] [--algorithm ALGORITHM] [--alpha ALPHA]
                   [--n_epoch N_EPOCH] [--learning_rate LEARNING_RATE]
                   [--batch_size BATCH_SIZE] [--seed SEED]
```
- You can choose the `algorithm` from the following algorithms:
  - `AE`: Autoencoder
  - `DeepSVDD`: DeepSVDD
  - `LOE`: Latent Outlier Exposure
  - `ABC`: Autoencoding Binary Classifier
  - `DeepSAD`: Deep Semi-Supervised Anomaly Detection
  - `SOEL`: Semi-supervised Outlier Exposure with a Limited labeling budget
  - `PU`: PU Learning Classifier
  - `PUAE`: Our approach with AE
  - `PUSVDD`: Our approach with DeepSVDD
- You can change `alpha`, the hyperparameter of the PU learning-based approaches
- You can also set the random `seed` and the training hyperparameters `n_epoch`, `learning_rate`, and `batch_size`


## Example
MNIST experiment (normal: 1 / unseen: 0) with our approach:
```
python main-cnn.py --dataset MNIST --normal_class 1 --unseen_anomaly 0 --algorithm PUSVDD
```
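To repeat the experiment above over several random seeds, the command line can be generated programmatically. This is a minimal sketch rather than a script shipped with the repository: `build_command` is a hypothetical helper, the seed values 0 to 2 are arbitrary examples, and the commands are only printed, not executed.

```python
import shlex


def build_command(dataset, normal_class, unseen_anomaly, algorithm, seed):
    """Return the argument list for one main-cnn.py run (documented flags only)."""
    return [
        "python", "main-cnn.py",
        "--dataset", dataset,
        "--normal_class", str(normal_class),
        "--unseen_anomaly", str(unseen_anomaly),
        "--algorithm", algorithm,
        "--seed", str(seed),
    ]


if __name__ == "__main__":
    # Example seeds; pass each list to subprocess.run(cmd, check=True) to execute it.
    for seed in range(3):
        cmd = build_command("MNIST", 1, 0, "PUSVDD", seed)
        print(shlex.join(cmd))
```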
