# TRACE Deployment Gate

A clean implementation of the TRACE (Theoretical Risk Attribution under Covariate-shift Effects) framework for deployment gate decisions in domain adaptation scenarios.

## Overview

This repository contains the complete implementation and results for the TRACE deployment gate experiment on DomainNet. On the real → sketch task, the TRACE score closely tracks the true risk differences between candidates:
- **Spearman ρ = 0.944** (95% CI: [0.793, 0.989])
- **Kendall τ = 0.832** (95% CI: [0.665, 0.945])
- **AUROC ≈ 1.0, AUPRC ≈ 1.0** across practical thresholds
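
For readers who want to see how a rank correlation with a confidence interval of this kind can be computed, here is a minimal pure-Python sketch. It assumes a percentile bootstrap over candidates; this README does not state how the reported intervals were obtained, so treat the method, the function names (`spearman`, `bootstrap_ci`), and the toy data as illustrative assumptions rather than the code in `analyze_results.py`.

```python
import random


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN when either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rho = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bootstrap_ci(x, y, n_boot=2000, alpha=0.05, seed=42):
    """Percentile bootstrap interval for Spearman rho (assumed method)."""
    rng = random.Random(seed)
    n = len(x)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        rho = spearman([x[i] for i in idx], [y[i] for i in idx])
        if rho == rho:  # drop NaN from degenerate resamples
            stats.append(rho)
    stats.sort()
    lo = stats[int((alpha / 2) * len(stats))]
    hi = stats[min(len(stats) - 1, int((1 - alpha / 2) * len(stats)))]
    return lo, hi


# Synthetic toy data (not the repository's results): 20 noisy, increasing scores.
rng = random.Random(0)
scores = [float(i) for i in range(20)]
risk_diffs = [s + rng.gauss(0, 3) for s in scores]
print(spearman(scores, risk_diffs), bootstrap_ci(scores, risk_diffs))
```

With only 20 candidates the interval is necessarily wide, which is consistent with the spread of the intervals listed above.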

## Repository Structure

```
TRACE_deployment_gate/
├── main.py                              # Main orchestration script
├── training.py                          # Source and candidate model training
├── dataset_loader.py                    # Data loading and preprocessing
├── model_factory.py                     # Model creation utilities
├── analyze_results.py                   # Generates result tables and plots
├── configs/
│   └── domainnet.yaml                   # Main experiment configuration
├── metrics/
│   ├── trace.py                         # Core TRACE implementation
│   ├── divergences.py                   # MMD and transport metrics
│   ├── model_change.py                  # Output discrepancy metrics
│   ├── ood_scores.py                    # OOD detection baselines
│   └── shift.py                         # Shift detection metrics
```

## Setup and Execution

### 1. Installation
First, clone the repository and install the required Python packages. We recommend using a virtual environment.
```bash
# Clone the repository
git clone [URL_TO_YOUR_REPO]
cd TRACE_deployment_gate

# Create and activate a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

### 2. Download Data
Next, download and prepare the DomainNet dataset. You will need to manually download the `real` and `sketch` domains.
```bash
# Create the data directory
mkdir -p data
cd data

# Download the zip files (approx. 1.5 GB total)
wget http://csr.bu.edu/ftp/visda/2019/multi-source/groundtruth/real.zip
wget http://csr.bu.edu/ftp/visda/2019/multi-source/groundtruth/sketch.zip

# Unzip the files
unzip real.zip
unzip sketch.zip

# Clean up zip files
rm real.zip sketch.zip

# Go back to the root directory
cd ..
```

### 3. Run the Full Experiment
Run the `main.py` script to train the source model, all 20 candidates, and perform the deployment gate analysis. The results will be saved to `experiments/deployment_gate/results/deployment_gate_results.csv`.
```bash
# This may take a significant amount of time and requires a GPU.
# The --train-source and --train-candidates flags ensure all models are trained from scratch.
python main.py \
    --config configs/domainnet.yaml \
    --train-source \
    --train-candidates
```

### 4. Analyze Results
After the main script finishes, you can generate the correlation tables and plots from the paper using the analysis script.
```bash
# Analyze the main results
python analyze_results.py --csv experiments/deployment_gate/results/deployment_gate_results.csv --outdir experiments/deployment_gate/results/analysis
```

### 5. Run on New Dataset
```bash
# Create your config file (see SETUP_GUIDE_New_Dataset.md)
python main.py --config configs/your_dataset.yaml
```

## Key Results

### DomainNet (real → sketch) Performance
| Metric | TRACE Score | Output Discrepancy | MMD Distance | W1 Distance |
|--------|-------------|-------------------|--------------|-------------|
| Spearman ρ | **0.944** | **0.944** | NaN (constant) | -0.048 |
| Kendall τ | **0.832** | **0.832** | NaN (constant) | -0.021 |
| AUROC | **1.00** | **1.00** | 0.50 (random) | 0.50 (random) |
| AUPRC | **1.00** | **1.00** | 0.95 (high prevalence) | 0.95 (high prevalence) |

### Key Insights
1. **Output discrepancy dominates**: The model change component (ρ = 0.944) is the primary signal
2. **Distance components are nearly constant**: MMD does not vary across candidates (so its rank correlations are undefined), and W1 varies only minimally (ρ = -0.048)
3. **TRACE provides near-perfect ranking**: Spearman ρ = 0.944 against the true risk differences
4. **MMD and W1 are interchangeable here**: Because neither distance varies meaningfully across candidates, both carry the same (chance-level) diagnostic value
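
The chance-level AUROC and the 0.95 AUPRC in the distance columns follow directly from a score that does not vary across candidates. The sketch below illustrates this in pure Python. The label split (19 positive, 1 negative out of 20) is an assumption chosen to match the 0.95 prevalence implied by the table, and `auroc` and `average_precision` are illustrative helpers rather than functions from this repository.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise average precision, one threshold per distinct score value."""
    n_pos = sum(labels)
    ap, prev_recall = 0.0, 0.0
    for t in sorted(set(scores), reverse=True):
        predicted = [y for y, s in zip(labels, scores) if s >= t]
        tp = sum(predicted)
        precision = tp / len(predicted)
        recall = tp / n_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


# Assumed label split: 19 of 20 candidates are positives (prevalence 0.95).
labels = [1] * 19 + [0]
constant_scores = [0.3] * 20  # a score that is identical for every candidate

print(auroc(labels, constant_scores))              # 0.5: no ranking information
print(average_precision(labels, constant_scores))  # 0.95: equals the prevalence
```

A constant score produces a single threshold at which every candidate is flagged, so precision equals the positive rate and AUPRC looks high even though the score carries no information.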

## Technical Details

### TRACE Framework Components
- **Output Discrepancy**: L2 distance between model logits on target data
- **Covariate Shift**: MMD or W1 distance in frozen ImageNet feature space
- **Lipschitz Proxies**: 99th percentile of input gradient norms
- **Generalization Gaps**: Validation-based estimates (omitted in gate mode)
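
As a rough illustration of the first two components, the sketch below computes a mean L2 distance between two sets of logits and a biased RBF-kernel estimate of squared MMD between two feature sets. It is written in pure Python for readability. The function names, the averaging over samples, the kernel choice, and the bandwidth are all assumptions; the actual implementations live in `metrics/model_change.py` and `metrics/divergences.py` and may differ.

```python
import math


def output_discrepancy(logits_a, logits_b):
    """Mean L2 distance between two models' logits on the same inputs (assumed reduction)."""
    dists = [math.dist(a, b) for a, b in zip(logits_a, logits_b)]
    return sum(dists) / len(dists)


def rbf_kernel(x, y, bandwidth):
    """Gaussian RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    return math.exp(-math.dist(x, y) ** 2 / (2 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD: E[k(x,x')] + E[k(y,y')] - 2 E[k(x,y)]."""
    def mean_k(a, b):
        return sum(rbf_kernel(p, q, bandwidth) for p in a for q in b) / (len(a) * len(b))

    return mean_k(xs, xs) + mean_k(ys, ys) - 2 * mean_k(xs, ys)


# Toy logits for two models on two target samples (hypothetical values).
source_logits = [[0.0, 0.0], [1.0, 1.0]]
candidate_logits = [[3.0, 4.0], [1.0, 1.0]]
print(output_discrepancy(source_logits, candidate_logits))  # (5 + 0) / 2 = 2.5

# Toy feature sets standing in for frozen-backbone features of two domains.
source_feats = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1]]
target_feats = [[2.0, 2.0], [2.1, 1.9], [1.9, 2.1]]
print(mmd_squared(source_feats, target_feats))
```

Note that the distance term depends only on the source and target features, not on the candidate model, which is one plausible reason it stays constant across candidates in the results above.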

### Model Training
- **Source Model**: ResNet-50 trained on 60% of source domain
- **Candidates**: Fine-tuned for 1-3 epochs on target domain
- **Hyperparameters**: Random selection for diversity
- **Architecture**: ImageNet-pretrained ResNet-50 with custom final layer
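
One way to obtain a diverse candidate pool is to draw each candidate's hyperparameters from a seeded random generator. The sketch below is a hypothetical illustration: only the 1-3 epoch range, the count of 20 candidates, and the seed come from this README, while the learning-rate and weight-decay ranges and the function name `sample_candidate_configs` are assumptions and may not match `training.py`.

```python
import random


def sample_candidate_configs(num_candidates=20, seed=42):
    """Draw one hyperparameter dict per candidate from a seeded generator."""
    rng = random.Random(seed)
    configs = []
    for i in range(num_candidates):
        configs.append({
            "candidate_id": i,
            "epochs": rng.randint(1, 3),                    # 1-3 epochs, as stated above
            "learning_rate": 10 ** rng.uniform(-5, -3),     # assumed log-uniform range
            "weight_decay": rng.choice([0.0, 1e-5, 1e-4]),  # assumed options
        })
    return configs


for cfg in sample_candidate_configs()[:3]:
    print(cfg)
```

Using a dedicated `random.Random(seed)` instance keeps the sampled pool independent of any other random calls made during training.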

### Data Splits
- **Source Domain**: 60% train, 20% val, 20% test
- **Target Domain**: 100% for training and evaluation
- **Evaluation**: Always on source test set (anchor domain)
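
A deterministic 60/20/20 split of the source domain can be sketched as follows. This is a minimal index-based example under the assumption that the split is a seeded shuffle; the function name `split_indices` is hypothetical and the logic in `dataset_loader.py` may differ (for example, it could stratify by class).

```python
import random


def split_indices(n, fractions=(0.6, 0.2, 0.2), seed=42):
    """Shuffle indices 0..n-1 with a fixed seed and cut them into train/val/test."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder, so no index is dropped
    return train, val, test


train_idx, val_idx, test_idx = split_indices(100)
print(len(train_idx), len(val_idx), len(test_idx))  # 60 20 20
```

Because the test set takes the remainder, rounding never drops samples, and rerunning with the same seed yields the same partition.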

## Dependencies

```bash
pip install torch torchvision
pip install pandas matplotlib seaborn scipy scikit-learn
pip install geomloss  # Optional, for Sinkhorn OT
pip install tqdm
```

## Configuration

Key parameters in `configs/domainnet.yaml`:
```yaml
source_domain: "real"
target_domain: "sketch"
num_classes: 345
batch_size: 64
learning_rate: 0.0001
max_epochs: 10
num_candidates: 20
seed: 42
```

## Reproducibility

All results are fully reproducible with:
- Fixed random seeds (seed=42)
- Deterministic data splits
- Consistent preprocessing
- Version-controlled configurations
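
A typical way to fix seeds in a PyTorch project is sketched below. The helper name `set_seed` is an assumption and is not necessarily what `main.py` uses. NumPy and PyTorch are seeded only if they are installed, so the snippet also runs in a bare Python environment.

```python
import random


def set_seed(seed=42):
    """Seed Python, and NumPy/PyTorch when available, for repeatable runs."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
        # Prefer deterministic cuDNN kernels over auto-tuned ones.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


set_seed(42)
first = random.random()
set_seed(42)
second = random.random()
print(first == second)  # True: same seed, same draw
```

Even with fixed seeds, some GPU operations can be nondeterministic, so small numerical differences between runs on different hardware or library versions are possible.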

## Documentation

- **`TECHNICAL_REPORT_DomainNet_Deployment_Gate.md`**: Complete technical documentation
- **`SETUP_GUIDE_New_Dataset.md`**: Step-by-step guide for new datasets
- **`results/mmd_vs_w1_analysis/`**: Detailed MMD vs W1 comparison

