# VC-BOD

This is the implementation of the paper *Decoupling Dependency Structures: Sklar’s theorem for explainable outlier detection*.

## Setup

You can either run the setup script or do it [manually](SETUP.md).

```bash
make setup
```

Make sure that all datasets are pulled correctly. If not, they can be downloaded manually from the [ODDS](http://odds.cs.stonybrook.edu/) website.

## Experiments

> [!NOTE]
> The core of our algorithm is implemented in R. The C++ objects in the Python implementation are not picklable, which prevents us from fully using multiple cores in Python. In addition, the statistics community still relies heavily on R.
> The rest of the code is written in Python, mainly because of its ease of use and its adoption in the ML research community.
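
To illustrate the pickling constraint: Python's `multiprocessing` sends tasks and results to worker processes by pickling them, so an object that cannot be pickled cannot cross that boundary. The sketch below is only an illustration. `NativeHandle` is a hypothetical stand-in for a wrapper around a C++ object and is not part of this repository or of any library it uses.

```python
import pickle
import threading


class NativeHandle:
    """Hypothetical stand-in for a Python wrapper around a C++ object."""

    def __init__(self):
        # A lock cannot be pickled, much like a pointer into native memory.
        self._handle = threading.Lock()


def is_picklable(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


# Plain Python data can be handed to worker processes ...
plain_ok = is_picklable({"alpha": 0.5})
# ... while the wrapper around the native handle cannot.
native_ok = is_picklable(NativeHandle())
```

A check like `is_picklable` is a quick way to find out whether an object can be passed to a `multiprocessing.Pool` before starting a long run.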

The R package parallelizes via forking. You can control the number of cores with the `-c` flag. On Windows only one core works, so we highly recommend running this on a Unix-based machine.
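
If you script your own runs, the core restriction above could be handled with a small guard like the following. This is a minimal sketch: `usable_cores` is a hypothetical helper, not a function from this repository, and whether `run.py` performs a similar check is not stated here.

```python
import os
import platform


def usable_cores(requested, system=None):
    """Pick a value for the -c flag that is safe on the given platform.

    `system` defaults to the current platform; passing it explicitly
    makes the function easy to test.
    """
    if requested < 1:
        raise ValueError("requested must be at least 1")
    system = system or platform.system()
    if system == "Windows":
        # Only one core works on Windows, as described above.
        return 1
    # Never ask for more cores than the machine reports.
    available = os.cpu_count() or 1
    return min(requested, available)
```

For example, `usable_cores(8, system="Windows")` returns `1`, while on a Unix-based machine the result is capped by the number of cores the machine reports.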

To replicate the results of our work you can run the following commands:

```bash
python run.py  -pd {PATH_TO_DAT_FOLDER} -sp results -md 3  -c 1 -dt 0.65 -dc 0.75
```

Note that, due to computational constraints, we excluded the datasets "fraud" and "mulcross" from the general experiments.
We re-ran those experiments with a training size of 0.1 and a test size of 0.5:

```bash
python run.py  -pd {PATH_TO_DAT_FOLDER} -sp results -md 3  -c 1 -ds mulcross --train_size 0.1 --test_size 0.5 -dt 0.65 -dc 0.75
```

```bash
python run.py  -pd {PATH_TO_DAT_FOLDER} -sp results -md 3  -c 1 -ds fraud --train_size 0.1 --test_size 0.5 -dt 0.65 -dc 0.75
```
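
The reduced-split runs above can also be driven from a small Python script. The following is a sketch under stated assumptions: it only reuses the flags and values shown in the commands above, `build_command` and `run_datasets` are hypothetical helper names that are not part of this repository, and it assumes the script is started from the directory that contains `run.py`.

```python
import subprocess
import sys


def build_command(data_dir, dataset, cores=1):
    """Return the argument list for one reduced-split run of run.py.

    Flags and values mirror the commands shown above.
    """
    return [
        sys.executable, "run.py",
        "-pd", data_dir,
        "-sp", "results",
        "-md", "3",
        "-c", str(cores),
        "-ds", dataset,
        "--train_size", "0.1",
        "--test_size", "0.5",
        "-dt", "0.65",
        "-dc", "0.75",
    ]


def run_datasets(data_dir, datasets, cores=1, dry_run=True):
    """Build one command per dataset; execute them only if dry_run is False."""
    commands = [build_command(data_dir, name, cores) for name in datasets]
    if not dry_run:
        for cmd in commands:
            # check=True stops at the first failing run.
            subprocess.run(cmd, check=True)
    return commands
```

With the default `dry_run=True` the function only returns the commands so you can inspect them. Calling `run_datasets("path/to/data", ["fraud"], dry_run=False)` would execute the run, using the dataset name exactly as you would pass it to `-ds`.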
