# Additional Experiments (ICLR Submission)

This folder contains **exploratory experiments** that are **not part of the main submission results**. They are used to probe how our weakly supervised decision-focused framework behaves on classical benchmark problems, and to compare it qualitatively against traditional decision-focused learning (DFL) methods such as SPO+.

These experiments are **illustrative**, not fully tuned or finalised. Numbers here should be interpreted as **sanity checks and qualitative comparisons**, not as definitive benchmarks.

---

## Overview

We include two additional problem classes:

1. **Linear Programming (LP) – AFIRO benchmark**  
   A small linear program from the NETLIB collection (commonly referred to as “AFIRO”). We use it as a transparent, well-understood LP to study bias recovery and regret convergence of our weakly supervised DFL framework.

2. **0–1 Knapsack**  
   A combinatorial optimisation problem where we must select items under a capacity constraint to maximise total value. This knapsack setting is adapted from the benchmark suite used in:

   > Mandi, J., Kotary, J., Berden, S., Mulamba, M., Bucarey, V., Guns, T., & Fioretto, F. (2024).  
   > *Decision-Focused Learning: Foundations, State of the Art, Benchmark and Future Opportunities*.  
   > Journal of Artificial Intelligence Research, 80.  
   > https://doi.org/10.1613/jair.1.15320

Our framework is a **weakly supervised reduction of DFL**, so its performance is **not expected to match** fully supervised methods such as SPO+ or other traditional DFL baselines. Instead, these experiments show how closely its regret approaches that of those baselines under weaker feedback assumptions.

---

## Code Layout

Within `ICLR_submission/additional_experiments/`:

- **Linear Programming (AFIRO)**
  - `LO/weak_dfl.py`  
    Main script for the AFIRO LP experiment.  
    Runs weak DFL and (optionally) an SPO+ baseline on the NETLIB AFIRO instance (`datasets/lp_afiro.mat`), then plots the regret trajectories.

- **Knapsack**
  - `knapsack/weak_knapsack_v2.py`  
    Main script for the 0–1 knapsack experiment.  
    Trains:
    - A **weak DFL** model using preference-based feedback over candidate solutions, and
    - An **SPO+** baseline using fully supervised access to item values.

    The script produces a **single comparison plot** of validation regret over epochs (weak DFL vs SPO+) and prints summary statistics.

- **Shared Utilities and Data**
  - `utils/common_utils.py` – shared helpers (random state control, regret computation, dataset generation, etc.).
  - `utils/minimisation_utils.py` – LP analysis utilities (nearby vertices, AFIRO MPS loading, etc.).
  - `datasets/` – shared problem data:
    - `lp_afiro.mat` – NETLIB AFIRO LP instance.
    - `Data.npz` – knapsack dataset (item weights, features, and values).

---

## Running the Experiments

All additional experiments are run from the repository root and assume the same environment as the main ICLR experiments. The commands below use a conda-based workflow, but any equivalent environment with the required packages works.

### 1. Environment and Setup

Environment creation and installation instructions are shared with the main ICLR experiments.  
Please follow the steps in:

`ICLR_submission/main_experiments/ReadMe.md`

The additional experiments use the same dependency set (see `requirements.txt` there).

---

### 2. Linear Programming – AFIRO (Weak DFL + SPO+)

From the repository root:

```bash
conda activate dfl
python ICLR_submission/additional_experiments/LO/weak_dfl.py
```

This will:

- Load `datasets/lp_afiro.mat` (AFIRO LP).  
- Generate a synthetic dataset of feature–cost pairs correlated with the base cost vector.  
- Train the **weak DFL** model with the current hyperparameters.  
- Optionally train an **SPO+** baseline (controlled inside `weak_dfl.py`).  
- Plot a **regret vs. epoch** curve for weak DFL (and SPO+ if enabled).

We chose the AFIRO example because it is small, interpretable, and widely used as a pedagogical example in the LP literature.
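
The synthetic dataset step can be pictured with the minimal sketch below. It is not the generator used in `weak_dfl.py`: the function name `make_feature_cost_pairs`, the additive linear-plus-noise model, and the `signal_scale` and `noise_std` parameters are all assumptions made for illustration. It only shows one plausible way to draw feature vectors whose associated cost vectors stay close to a fixed base cost vector.

```python
import random
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], List[float]]  # (features, cost vector)


def make_feature_cost_pairs(
    base_cost: Sequence[float],
    n_samples: int,
    n_features: int,
    signal_scale: float = 0.1,  # hypothetical strength of the feature signal
    noise_std: float = 0.01,    # hypothetical observation noise
    seed: int = 0,
) -> List[Sample]:
    """Draw (features, cost) pairs whose costs are perturbations of base_cost.

    Assumed model (illustrative only): cost_j = base_cost_j
    + signal_scale * <w_j, x> / n_features + Gaussian noise.
    """
    rng = random.Random(seed)
    # One fixed random weight vector per cost coefficient.
    weights = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in base_cost]
    data: List[Sample] = []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        cost = []
        for c0, w in zip(base_cost, weights):
            signal = sum(wi * xi for wi, xi in zip(w, x)) / n_features
            cost.append(c0 + signal_scale * signal + rng.gauss(0.0, noise_std))
        data.append((x, cost))
    return data


# Toy base cost vector (not the AFIRO coefficients).
pairs = make_feature_cost_pairs([1.0, 0.0, -2.0], n_samples=5, n_features=3, seed=1)
print(len(pairs), len(pairs[0][0]), len(pairs[0][1]))
```

An additive perturbation is used here so that zero entries of the base cost vector can still vary with the features; a multiplicative model would keep them at zero.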

---

### 3. Knapsack – Weak DFL vs SPO+

From the repository root:

```bash
conda activate dfl
python ICLR_submission/additional_experiments/knapsack/weak_knapsack_v2.py
```

This script will:

1. Load the knapsack data from `datasets/Data.npz`.
2. Construct an OR-Tools-based knapsack solver (SCIP backend).
3. Train:
   - **Weak DFL (preference-based)** using labels derived from utilities over nearby feasible solutions.
   - **SPO+ (fully supervised)** using the canonical SPO+ loss and gradient as in Mandi et al. (2024).
4. Produce a **single plot** comparing validation regret over epochs for weak DFL vs SPO+.

These comparisons highlight how our weakly supervised reduction behaves relative to a fully supervised SPO+ baseline on a standard knapsack benchmark.
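
The regret metric and the preference labels from step 3 can be illustrated with a small, dependency-free sketch. It is not the code in `weak_knapsack_v2.py`: the script uses an OR-Tools solver with the SCIP backend, whereas this sketch swaps in a plain dynamic-programming solver. The function names (`solve_knapsack`, `regret`, `preference_label`) and the toy instance are hypothetical. Regret is assumed to follow the usual DFL definition (true value lost by acting on predicted values), and the preference label is assumed to simply compare the true utilities of two feasible selections.

```python
from typing import List, Sequence


def solve_knapsack(values: Sequence[float], weights: Sequence[int], capacity: int) -> List[int]:
    """Exact 0-1 knapsack by dynamic programming; returns a 0/1 selection vector."""
    n = len(values)
    # best[i][c] = best total value using the first i items within capacity c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        v, w = values[i - 1], weights[i - 1]
        for c in range(capacity + 1):
            best[i][c] = best[i - 1][c]
            if w <= c and best[i - 1][c - w] + v > best[i][c]:
                best[i][c] = best[i - 1][c - w] + v
    # Backtrack to recover which items were taken.
    selection = [0] * n
    c = capacity
    for i in range(n, 0, -1):
        if best[i][c] != best[i - 1][c]:
            selection[i - 1] = 1
            c -= weights[i - 1]
    return selection


def total_value(values: Sequence[float], selection: Sequence[int]) -> float:
    return sum(v * s for v, s in zip(values, selection))


def regret(true_values, predicted_values, weights, capacity) -> float:
    """True value of the oracle solution minus true value of the predicted-optimal one."""
    oracle = solve_knapsack(true_values, weights, capacity)
    decision = solve_knapsack(predicted_values, weights, capacity)
    return total_value(true_values, oracle) - total_value(true_values, decision)


def preference_label(sol_a, sol_b, true_values) -> int:
    """Assumed weak label: 1 if sol_a has strictly higher true utility than sol_b, else 0."""
    return int(total_value(true_values, sol_a) > total_value(true_values, sol_b))


true_values = [10.0, 7.0, 4.0]
predicted = [12.0, 3.0, 2.0]   # over-values item 0
weights, capacity = [4, 3, 2], 5

print(regret(true_values, predicted, weights, capacity))       # 1.0
print(preference_label([0, 1, 1], [1, 0, 0], true_values))     # 1
```

In this toy instance the predicted values lead the solver to pick item 0 alone (true value 10), while the oracle picks items 1 and 2 (true value 11), giving a regret of 1. The weak learner in this sketch only ever sees the comparison outcome, not the item values themselves.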

---

## Notes and Caveats

- **Reproducibility:**  
  Scripts use fixed random seeds and consistent dataset generation procedures, but some components (e.g., OR-Tools solver behaviour or low-level tensor kernels) may still introduce small run-to-run variations.

- **Scope:**  
  These additional experiments are **not referenced in the main ICLR submission text**. They are provided as supplementary material for readers interested in how the proposed framework behaves on classical LP and knapsack benchmarks.
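
For the reproducibility note above, a seeding helper along the following lines is a common pattern. This is a hedged sketch, not the helper in `utils/common_utils.py`: the name `set_seed` is hypothetical, and NumPy and PyTorch are seeded only if they happen to be installed. Even with such a helper, solver internals and some low-level kernels can remain non-deterministic.

```python
import os
import random


def set_seed(seed: int) -> None:
    """Seed the common sources of randomness (hypothetical helper)."""
    # PYTHONHASHSEED only takes effect for Python processes started afterwards.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)      # legacy global NumPy RNG
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)   # seeds PyTorch's default generators
    except ImportError:
        pass


set_seed(123)
first = random.random()
set_seed(123)
second = random.random()
print(first == second)
```

Calling `set_seed` once at the top of a script, before any data generation or model initialisation, keeps the order of random draws consistent between runs.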
