# Bridging Unsupervised and Semi-Supervised Anomaly Detection: A Theoretically-Grounded and Practical Framework with Synthetic Anomalies

Our results are provided in the Jupyter notebook files.
To reproduce them, follow the steps below:
1. Install the required packages: `pip install -r requirements.txt`
2. Download the corresponding datasets (e.g., [NSL-KDD](https://web.archive.org/web/20150205070216/http://nsl.cs.unb.ca/NSL-KDD/) [1]: `KDDTrain+.TXT` and `KDDTest+.TXT`) into a `data` folder that sits in the same directory as this folder, and change `dataset_name` in `run.py` accordingly.
    - Note that for the text and image datasets (AdvBench and MVTec), you will need to extract the embeddings first; these embeddings are used as the data. We used Sentence-BERT for text and DINOv2 for images.
3. Run the desired Jupyter notebook to obtain aggregated results.
    - For the composite methods (unsupervised AD followed by supervised binary classification), run `python run_unsupAD_supBC.py` after changing the variables in the file accordingly. These composite methods do not perform well, so we do not include larger Jupyter notebook files for them.
    - The first set of aggregated results uses a threshold estimated at 5%, while the second set uses the middle threshold.
We only consider AUPR, so the threshold does not matter for the most part (i.e., either set of results can be used).
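
AUPR depends only on how the anomaly scores rank the test points, not on any single decision threshold, which is why the two sets of results largely agree on it. The block below is a minimal sketch of this, not the code used in the notebooks: `average_precision` and the toy labels and scores are illustrative, it uses average precision as the AUPR estimate, and it assumes there are no tied scores.

```python
def average_precision(labels, scores):
    """Average precision, a common AUPR estimate. Label 1 marks an anomaly.

    Illustrative helper: assumes no tied scores.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one anomaly to compute AUPR")
    # Rank points from most to least anomalous.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            true_pos += 1
            total += true_pos / rank  # precision at this recall step
    return total / n_pos


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
ap = average_precision(labels, scores)

# A monotone rescaling of the scores keeps the ranking, so the value is unchanged.
ap_rescaled = average_precision(labels, [10 * s - 3 for s in scores])
print(round(ap, 4), round(ap_rescaled, 4))  # 0.8333 0.8333
```

Metrics that depend on a hard decision (e.g., F1) would change with the threshold, whereas the ranking-based value above does not.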

Due to space limitations, we only include code for our method.
Results are also provided in the Jupyter notebook files, except that for MVTec we only provide results for 2 objects; the remainder can be obtained by changing the `kwargs_data["obj"]` variable.
We do not provide the datasets or the foundation models, but these are all open-source.
We also do not provide the notebook files for the competing methods we evaluate, although those can be obtained by changing the `model_type` variable.
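
Since the datasets are not bundled, it can help to confirm that the downloaded files are in place before running a notebook. The block below is a small sketch of such a check: the file names are the NSL-KDD ones from step 2, while `missing_files` and its arguments are hypothetical helpers that are not part of this repository.

```python
from pathlib import Path

# File names from step 2; other datasets would need their own list.
NSL_KDD_FILES = ("KDDTrain+.TXT", "KDDTest+.TXT")


def missing_files(data_dir, expected=NSL_KDD_FILES):
    """Return the expected file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in expected if not (data_dir / name).is_file()]
```

For example, `missing_files("path/to/data")` returns an empty list when both files are present; substitute the actual location of your `data` folder.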


[1] Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu, and Ali A. Ghorbani. A detailed analysis of the KDD cup 99
data set. In 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications,
pages 1–6, 2009. doi: 10.1109/CISDA.2009.5356528.