# Gradient Manifold Geometry as a Signature for Adversarial Detection

This repository contains the official implementation of the paper "**Gradient Manifold Geometry as a Signature for Adversarial Detection**". It provides the code needed to reproduce every experiment in the paper, which detects adversarial examples by analyzing the intrinsic dimensionality (ID) of model gradients.
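
For readers new to intrinsic dimensionality, the sketch below shows one common way to estimate it from a point cloud: the nearest-neighbor maximum-likelihood estimator of Levina and Bickel. It is an illustration only. The function name `mle_intrinsic_dimension`, the choice of estimator, and the parameter `k` are assumptions made here and may differ from what the paper and the scripts in this repository actually use.

```python
import math


def mle_intrinsic_dimension(points, k=10):
    """Estimate intrinsic dimensionality with a k-NN maximum-likelihood estimator.

    Hypothetical helper for illustration; not taken from this repository.
    `points` is a list of equal-length coordinate sequences, e.g. flattened gradients.
    """
    n = len(points)
    if not 2 <= k < n:
        raise ValueError("k must satisfy 2 <= k < number of points")

    estimates = []
    for i, p in enumerate(points):
        # Distances from point i to its k nearest neighbors, sorted ascending.
        dists = sorted(
            math.dist(p, q) for j, q in enumerate(points) if j != i
        )[:k]
        if dists[0] == 0.0:
            continue  # skip exact duplicates, which make the log ratio undefined
        # Sum of log(T_k / T_j) over the k-1 closer neighbors.
        log_sum = sum(math.log(dists[-1] / d) for d in dists[:-1])
        if log_sum > 0.0:
            estimates.append((k - 1) / log_sum)

    if not estimates:
        raise ValueError("no usable points (all neighbors degenerate)")
    # Average the per-point estimates into a single global estimate.
    return sum(estimates) / len(estimates)
```

Points sampled from a 2-D plane embedded in a higher-dimensional space should yield an estimate close to 2, regardless of the ambient dimension.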

## 1. Setup

### Requirements
Install the required Python packages from the provided `requirements.txt` file:
```bash
pip install -r requirements.txt
```

### Datasets
The scripts will automatically download the **CIFAR-10**, **MNIST**, and **SVHN** datasets to a local `./data` directory upon first run.

For the **MS COCO** dataset (used in Experiment 3), please download the 2017 version and place it in the `./data` directory. The expected structure is:
```
data/
└── coco2017/
    ├── train2017/
    ├── val2017/
    └── annotations/
        ├── instances_train2017.json
        └── instances_val2017.json
```
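
Before launching Experiment 3, it can help to confirm that the COCO files are where the scripts expect them. The following is a minimal sketch that checks the layout shown above; the helper `missing_coco_paths` and the constant `EXPECTED_COCO_PATHS` are hypothetical names introduced here and are not part of the repository.

```python
from pathlib import Path

# Paths expected under data/coco2017/, taken from the layout shown above.
EXPECTED_COCO_PATHS = [
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
]


def missing_coco_paths(data_root="./data"):
    """Return the expected COCO paths that do not exist under data_root/coco2017."""
    coco_root = Path(data_root) / "coco2017"
    return [rel for rel in EXPECTED_COCO_PATHS if not (coco_root / rel).exists()]


# Example usage:
#   missing = missing_coco_paths("./data")
#   if missing:
#       print("Missing COCO paths:", missing)
```

An empty list means the directory structure matches the expected layout.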

## 2. Reproducing Experiments

The experiments are organized into three main scripts, each corresponding to a major section in the paper. The generated results (plots and logs) will be saved in the `results/` directory, and model checkpoints will be stored in `checkpoints/` to avoid re-training.

### Experiment 1: Batch-Wise Gradient Analysis (Section 5.1)

This experiment simulates the batch-wise client detection scenario. The script `run_exp1_batch_wise.py` will train a model on the specified dataset, simulate malicious and benign clients, and generate the corresponding plots.

**Usage:**
```bash
# To reproduce results for CIFAR-10:
python run_exp1_batch_wise.py --dataset cifar10

# To reproduce results for MNIST:
python run_exp1_batch_wise.py --dataset mnist

# To reproduce results for SVHN:
python run_exp1_batch_wise.py --dataset svhn
```
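
To run all three datasets in one go, a small driver along these lines may be convenient. It is a sketch, not part of the repository: `build_command` and `run_all` are hypothetical helpers, and the only things taken from this README are the script name and the `--dataset` values.

```python
import subprocess
import sys

# Dataset names accepted by --dataset, as listed in the usage examples above.
DATASETS = ("cifar10", "mnist", "svhn")


def build_command(dataset):
    """Build the argument list for one Experiment 1 run."""
    return [sys.executable, "run_exp1_batch_wise.py", "--dataset", dataset]


def run_all(datasets=DATASETS):
    """Run Experiment 1 for each dataset in sequence, stopping on the first failure."""
    for dataset in datasets:
        subprocess.run(build_command(dataset), check=True)


# Example usage (from the repository root):
#   run_all()
```

Using `check=True` makes a failed run raise `subprocess.CalledProcessError` instead of silently continuing to the next dataset.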

### Experiment 2: Individual Gradient Analysis (Section 5.2)

These experiments, conducted on the SVHN dataset, evaluate our method's ability to detect individual adversarial samples.

**Experiment 2a - ID Comparison Plot:**
This script generates the plots comparing the incremental ID of normal vs. adversarial samples for PGD and AutoAttack.
```bash
python run_exp2a_individual_comparison.py
```

**Experiment 2b - Detection and Histogram:**
This script calibrates the detection thresholds using a hold-out set and evaluates the final detection accuracy, generating the ID distribution histogram.
```bash
python run_exp2b_individual_detection.py
```

### Experiment 3: SOTA Comparison on CIFAR-10 & MS COCO (Section 5.3)

The `run_exp3_sota.py` script reproduces the state-of-the-art comparison results reported in Tables 2 and 3. The workflow has two main steps:
1.  **Threshold Search (Optional):** Finds the optimal detection thresholds for a given attack on a small calibration set.
2.  **Full Evaluation:** Evaluates the detection rate on the entire test set using the found or provided thresholds.
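
The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the repository's implementation: it assumes each sample has already been reduced to a scalar ID value, that a sample is flagged as adversarial when its ID falls outside the band `[low, high]`, and that the search maximizes detection rate minus false-positive rate on the calibration set. All function names are hypothetical, and the numbers in the usage comment are synthetic.

```python
from itertools import combinations


def is_flagged(id_value, low, high):
    # Assumption: ID values outside the [low, high] band count as adversarial.
    return id_value < low or id_value > high


def flagged_fraction(id_values, low, high):
    """Fraction of samples flagged as adversarial for the given thresholds."""
    return sum(is_flagged(v, low, high) for v in id_values) / len(id_values)


def search_thresholds(benign_ids, adv_ids):
    """Step 1: exhaustive search over threshold pairs on a small calibration set.

    Candidate thresholds are the observed ID values; the score is
    (detection rate on adversarial samples) - (false-positive rate on benign ones).
    """
    candidates = sorted(set(benign_ids) | set(adv_ids))
    best = None
    for low, high in combinations(candidates, 2):  # guarantees low < high
        score = flagged_fraction(adv_ids, low, high) - flagged_fraction(
            benign_ids, low, high
        )
        if best is None or score > best[0]:
            best = (score, low, high)
    if best is None:
        raise ValueError("need at least two distinct ID values to calibrate")
    return best[1], best[2]


# Step 2: evaluate on the full test set with the thresholds found in step 1.
# Example usage with synthetic values:
#   low, high = search_thresholds([5.1, 5.2, 5.3], [4.0, 4.9, 5.6, 6.0])
#   rate = flagged_fraction(test_adv_ids, low, high)
```

The exhaustive search is quadratic in the number of calibration samples, which is acceptable only because the calibration set is small.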

**Usage:**

The script requires specifying both the dataset and the attack type.

**To run the full pipeline (threshold search + evaluation):**
This will first find the best thresholds and then use them for the final evaluation.
```bash
# Example for PGD on CIFAR-10
python run_exp3_sota.py --dataset cifar10 --attack PGD

# Example for CW on COCO
python run_exp3_sota.py --dataset coco --attack CW
```

**To run evaluation with pre-tuned thresholds (faster):**
If you already know the optimal thresholds, pass them with `--low_thresh` and `--high_thresh` to skip the search phase and go straight to the final evaluation.
```bash
# Example for PGD on CIFAR-10 using pre-found thresholds
python run_exp3_sota.py --dataset cifar10 --attack PGD --low_thresh 4.8599 --high_thresh 4.8682
```
This is the quickest way to verify the results reported in the paper.