# DiSPaT: f-Divergence Self-Play for Tabular Anomaly Detection via Large Language Models

## Introduction

We propose DiSPaT, a self-play fine-tuning framework that strengthens a large language model's understanding of normal tabular data. Building on $f$-divergence minimization, we derive a tight approximation that connects our training objective to reducing the distributional gap between real normal data and model-generated samples. DiSPaT alternates between two steps. At each iteration, the current policy generates synthetic samples that serve as pseudo-anomalies, and a critic (discriminator) learns to distinguish them from real normal samples. The critic's signal then drives policy updates that progressively align the model distribution with the true normal-data distribution.
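
For orientation, the textbook definition of an $f$-divergence is given below. The symbols $P$ (distribution of real normal data) and $Q_\theta$ (distribution of the model with parameters $\theta$) are notation assumed here for illustration. The exact DiSPaT objective is not reproduced in this README.

$$
D_f(P \,\|\, Q_\theta) = \mathbb{E}_{x \sim Q_\theta}\left[ f\!\left( \frac{P(x)}{Q_\theta(x)} \right) \right], \qquad f \text{ convex on } (0, \infty),\quad f(1) = 0 .
$$

For example, $f(t) = t \log t$ recovers the KL divergence $\mathrm{KL}(P \,\|\, Q_\theta)$, and $D_f(P \,\|\, Q_\theta) = 0$ whenever $P = Q_\theta$.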

## Installation

### Prerequisites

- Python 3.10
- CUDA 11.8 or 12.x

### Step 1: Create Conda Environment

```bash
conda env create -f enviroment.yaml
conda activate dispat
```

### Step 2: Install PyTorch

**For CUDA 12.x:**
```bash
pip install torch==2.9.0 torchvision==0.24.0 torchaudio==2.9.0 --index-url https://download.pytorch.org/whl/cu121
```

**For CUDA 11.8:**
```bash
pip install torch==2.9.0 torchvision==0.24.0 torchaudio==2.9.0 --index-url https://download.pytorch.org/whl/cu118
```

### Step 3: Install Dependencies

```bash
pip install -r requirements.txt
```

## Quick Start

### 1. Train SFT Model

```bash
python SFT/train_anollm.py \
    --dataset vifd \
    --model smol-360 \
    --max_steps 2000 \
    --batch_size 16 \
    --binning standard \
    --setting semi_supervised \
    --n_splits 1 \
    --split_idx 0
```

### 2. Run DiSPaT Pipeline

```bash
bash scripts/main_results/run_dispat_vifd.sh
```

### 3. Evaluate Model

```bash
python DiSPaT/evaluate_dispat.py \
    --dataset vifd \
    --exp_dir exp_dispat/360M/vifd/split0 \
    --model smol-360 \
    --binning standard \
    --n_splits 1 \
    --split_idx 0 \
    --setting semi_supervised
```

## Running Experiments

### Main Results

```bash
bash scripts/main_results/run_dispat_vifd.sh
bash scripts/main_results/run_dispat_fakejob.sh
bash scripts/main_results/run_dispat_seismic.sh
bash scripts/main_results/run_dispat_ecoli.sh
bash scripts/main_results/run_dispat_lymphography.sh
bash scripts/main_results/run_dispat_20news.sh
```

Set the `TRAIN_GPUS` and `INFERENCE_GPUS` environment variables to choose which GPU IDs are used (both default to `"0"`).
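
If you prefer to launch runs from Python, the sketch below shows one way to pass the two variables to a main-results script. `build_launch` is a hypothetical helper, not part of the repository, and the comma-separated form `"0,1"` for multiple GPUs is an assumption; check the shell scripts for the format they actually expect.

```python
import os


def build_launch(dataset, train_gpus="0", inference_gpus="0"):
    """Return (command, environment) for one main-results script.

    Hypothetical helper: it only assembles the call, it does not run it.
    """
    cmd = ["bash", f"scripts/main_results/run_dispat_{dataset}.sh"]
    # Copy the current environment and override the two GPU variables.
    env = dict(os.environ, TRAIN_GPUS=train_gpus, INFERENCE_GPUS=inference_gpus)
    return cmd, env


cmd, env = build_launch("vifd", train_gpus="0,1", inference_gpus="2")
print(" ".join(cmd), "| TRAIN_GPUS =", env["TRAIN_GPUS"])
# To start the run for real:
#   import subprocess
#   subprocess.run(cmd, env=env, check=True)
```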

### Clamping Threshold Experiments

```bash
bash scripts/clamping_threshold/run_dispat_epsilon.sh
```

### f-Divergence Experiments

```bash
bash scripts/f_divergence/run_dispat_fdiv.sh
```

## Key Parameters

### DiSPaT Training Parameters

- `--beta`: Temperature parameter (default: 0.1)
- `--epsilon`: Clamping threshold for domain constraints (default: 0.02)
- `--f_divergence_type`: Type of f-divergence (`identity`, `kl`, `reverse_kl`, `squared_hellinger`)
- `--iteration`: Iteration number (0, 1, 2, ...)
- `--max_steps`: Maximum training steps per iteration
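
To give a feel for what the `--f_divergence_type` options measure, here is a minimal sketch that evaluates three textbook $f$-divergences on toy discrete distributions. The generator functions are the standard ones (the squared Hellinger form is written without the 1/2 factor that some texts include). How the repository turns these names into a training loss is not shown here, and `identity` is omitted because its definition is not given in this README. `GENERATORS` and `f_divergence` are illustrative names only.

```python
import math

# Generator functions f(t); each is convex on t > 0 with f(1) = 0.
GENERATORS = {
    "kl": lambda t: t * math.log(t),
    "reverse_kl": lambda t: -math.log(t),
    "squared_hellinger": lambda t: (math.sqrt(t) - 1.0) ** 2,
}


def f_divergence(p, q, kind):
    """D_f(P || Q) = sum_x q(x) * f(p(x) / q(x)) for strictly positive p and q."""
    f = GENERATORS[kind]
    return sum(qi * f(pi / qi) for pi, qi in zip(p, q))


p = [0.7, 0.2, 0.1]  # toy stand-in for the real normal-data distribution
q = [0.5, 0.3, 0.2]  # toy stand-in for the model distribution

for kind in GENERATORS:
    print(kind, round(f_divergence(p, q, kind), 4))
```

Each divergence is zero when the two distributions coincide and positive otherwise, which is the property the self-play objective relies on.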

### Model Parameters

- `--model`: Model size (`smol`, `smol-360`, `smol-1.7b`)
- `--batch_size`: Training batch size (default: 16-32)
- `--lr`: Learning rate (default: 5e-5)
- `--binning`: Numerical binning method (`standard`, `quantile`, `equal_width`)

## Evaluation Metrics

Results are evaluated using:
- **AUC-ROC**: Area under the ROC curve
- **AUC-PR**: Area under the Precision-Recall curve
- **F1-Score**: F1 score at optimal threshold
- **Precision/Recall**: Precision and recall at optimal threshold
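
The metrics above can be sketched in plain Python as follows. This is an illustration, not the repository's evaluation code. It assumes that label `1` marks an anomaly, that higher scores mean "more anomalous", that AUC-PR is estimated by average precision (with no tied scores), and that the "optimal threshold" is the one maximizing F1.

```python
def auc_roc(labels, scores):
    """Probability that a random anomaly outscores a random normal sample (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each anomaly."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            tp += 1
            total += tp / rank
    return total / tp


def best_f1(labels, scores):
    """Try every observed score as a threshold; return (f1, precision, recall, threshold)."""
    best = (0.0, 0.0, 0.0, None)
    for thr in sorted(set(scores)):
        pred = [s >= thr for s in scores]
        tp = sum(1 for y, yhat in zip(labels, pred) if y == 1 and yhat)
        fp = sum(1 for y, yhat in zip(labels, pred) if y == 0 and yhat)
        fn = sum(1 for y, yhat in zip(labels, pred) if y == 1 and not yhat)
        if tp == 0:
            continue  # precision and F1 are undefined or zero here
        precision, recall = tp / (tp + fp), tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best[0]:
            best = (f1, precision, recall, thr)
    return best


labels = [0, 0, 1, 0, 1, 1, 0, 0]
scores = [0.1, 0.2, 0.9, 0.3, 0.8, 0.25, 0.15, 0.05]
print("AUC-ROC:", auc_roc(labels, scores))
print("AUC-PR (average precision):", average_precision(labels, scores))
print("Best F1, precision, recall, threshold:", best_f1(labels, scores))
```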


## 📜 License

This project is licensed under the Apache License 2.0 - see the [LICENSE](LICENSE) file for details.