# High-Dimensional Robustness Benchmark: Nested DRO vs. Baselines

This repository implements a comparative study of distributionally robust optimization (DRO) methods on high-dimensional synthetic data. It evaluates how each algorithm performs as the feature dimension grows from $d=10$ to $d=50$ when the data mixes clean samples, high-leverage points, and adversarially labeled outliers.

The core experiment is designed to demonstrate the effectiveness of **Nested DRO** with a **Dynamic Epsilon Scheduler** relative to state-of-the-art baselines such as OR-WDRO and UOT-DRO.

## The Dataset (High-Dim)

The experiment uses a custom `KillerDatasetHighDim` dataset designed to break various types of robust estimators. The data distribution is a mixture of three components:

1.  **Normal Background (70%)**: Standard linear data $y = w^T x + b + \epsilon$.
2.  **Distant High Leverage (10%)**: Points far from the center ($10\times$ magnitude) but with **valid** labels. These are "good" leverage points that help estimation if correctly identified.
3.  **Proximal Outliers (20%)**: Points near the center but with **adversarial** labels (inverted weights + large bias). These are "bad" outliers designed to blend in with normal data in high dimensions.
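
The three-component mixture above can be sketched in NumPy as follows. This is a minimal illustration, not the repository's `KillerDatasetHighDim`: the function name `make_killer_dataset`, the Gaussian features, the noise level, and the bias shift of `20.0` are all assumptions chosen for the example.

```python
import numpy as np

def make_killer_dataset(n=1000, d=10, noise_std=0.1, seed=0):
    """Hypothetical stand-in for the 70/10/20 mixture (all constants are assumptions)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=d)   # ground-truth weights
    b = 1.0                  # ground-truth bias (assumed value)

    n_lev = n // 10          # 10% distant high-leverage points
    n_out = n // 5           # 20% proximal outliers
    n_norm = n - n_lev - n_out  # remaining 70% normal background

    # 1. Normal background: y = w^T x + b + noise
    x_norm = rng.normal(size=(n_norm, d))
    y_norm = x_norm @ w + b + noise_std * rng.normal(size=n_norm)

    # 2. Distant high leverage: 10x feature magnitude, labels from the true model
    x_lev = 10.0 * rng.normal(size=(n_lev, d))
    y_lev = x_lev @ w + b + noise_std * rng.normal(size=n_lev)

    # 3. Proximal outliers: features near the center, labels from inverted
    #    weights plus a large bias (the shift of 20.0 is an assumption)
    x_out = rng.normal(size=(n_out, d))
    y_out = x_out @ (-w) + (b + 20.0) + noise_std * rng.normal(size=n_out)

    X = np.concatenate([x_norm, x_lev, x_out])
    y = np.concatenate([y_norm, y_lev, y_out])
    group = np.concatenate([
        np.zeros(n_norm, dtype=int),   # 0 = normal
        np.ones(n_lev, dtype=int),     # 1 = high leverage
        np.full(n_out, 2, dtype=int),  # 2 = proximal outlier
    ])

    perm = rng.permutation(n)  # shuffle so the groups are interleaved
    return X[perm], y[perm], group[perm], w, b

X, y, group, w_true, b_true = make_killer_dataset(n=1000, d=50)
```

Returning the `group` labels makes it easy to check, after training, whether an estimator kept the leverage points and rejected the outliers.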

## Algorithms Compared

The repository compares four methods:

1.  **Nested DRO (Proposed)**:
    * Uses a **Dynamic Epsilon Scheduler** based on the signal-to-noise ratio: $\epsilon_t \propto (\text{Mean Loss} / \text{Std Loss})^2$.
    * Optimizes a variance-regularized objective.
2.  **OR-WDRO**:
    * Uses a specialized trimming mechanism and robust mean/variance estimation.
    * *Includes a dedicated parameter search script.*
3.  **UOT-DRO**:
    * Relaxes the marginal constraints using KL divergence.
    * *Includes a dedicated parameter search script.*
4.  **Standard DRO**:
    * Baseline Wasserstein DRO (WDRO) implementation using Huber loss and projected gradient descent (PGD).
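
As a rough illustration of the scheduler rule $\epsilon_t \propto (\text{Mean Loss} / \text{Std Loss})^2$ used by Nested DRO, the sketch below computes $\epsilon_t$ from a batch of per-sample losses. The function name `dynamic_epsilon`, the proportionality constant `scale`, and the clipping bounds `eps_min` and `eps_max` are assumptions made for this example; the repository's scheduler may differ in all of them.

```python
import statistics

def dynamic_epsilon(losses, scale=0.1, eps_min=1e-4, eps_max=1.0, tiny=1e-12):
    """Hypothetical scheduler: eps_t = scale * (mean loss / std loss)^2, clipped.

    `scale`, `eps_min`, `eps_max`, and `tiny` are assumed values for illustration.
    """
    mean_loss = statistics.fmean(losses)
    std_loss = statistics.pstdev(losses)         # population std of the batch losses
    ratio = mean_loss / (std_loss + tiny)        # `tiny` guards against zero spread
    eps = scale * ratio ** 2
    return min(max(eps, eps_min), eps_max)       # keep the radius in a sane range

# Losses with a wide spread (e.g. some samples fit badly) give a small radius;
# nearly uniform losses give a large one, capped at eps_max.
eps_spread = dynamic_epsilon([1.0, 1.0, 1.0, 3.0])
eps_uniform = dynamic_epsilon([2.0, 2.0, 2.0, 2.0])
```

In a training loop, such a function would typically be called once per epoch or step on detached loss values, so that $\epsilon_t$ acts as a constant during the gradient update.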

## File Structure

* **`synthetic_dims.py`** (Main Script):
    * Runs the full benchmark across dimensions $[10, 20, 30, 40, 50]$.
    * Trains all four models over multiple random seeds.
    * Generates the final plot `excess_risk_dimension_dynamic_ratio.pdf`.
* **`or_wdro_parameterSearching.py`**:
    * Performs grid search to find universal hyperparameters ($\epsilon$, $\sigma_{mult}$) for OR-WDRO across low and high dimensions.
* **`uot_dro_parameterSearching.py`**:
    * Performs grid search to find optimal hyperparameters ($\lambda$, $\beta$, $\lambda_2$) for UOT-DRO.
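
The two parameter search scripts follow the usual grid search pattern, which can be sketched generically as below. The helper `grid_search` and the toy objective `toy_risk` are hypothetical; in the actual scripts the score would presumably be a risk measured after training, possibly averaged over several dimensions to obtain universal settings.

```python
import itertools

def grid_search(evaluate, grid):
    """Exhaustively try every combination in `grid` and keep the lowest score.

    evaluate: callable taking hyperparameters as keyword arguments and
              returning a score to minimize (hypothetical interface).
    grid:     dict mapping hyperparameter name -> list of candidate values.
    """
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Toy stand-in for "train the model and measure its risk" (an assumption).
def toy_risk(eps, sigma_mult):
    return (eps - 0.1) ** 2 + (sigma_mult - 2.0) ** 2

best, score = grid_search(
    toy_risk,
    {"eps": [0.01, 0.1, 1.0], "sigma_mult": [1.0, 2.0, 3.0]},
)
```

The cost grows with the product of the grid sizes, so coarse grids per hyperparameter are a common starting point.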

## Requirements

* Python 3.x
* PyTorch
* NumPy
* Matplotlib

```bash
pip install torch numpy matplotlib
```