
# Towards Decision-Focused Learning for Sparse and Weakly Supervised Environments

A framework for meta-learning in trip planning scenarios with sparse and weakly supervised feedback. Supports both synthetic and real-world datasets with configurable hyperparameters and Optuna-based tuning.

This `main_experiments` folder contains the **experiments and code that appear in the main paper results**.  
Additional exploratory experiments on classical LP (AFIRO) and knapsack benchmarks are provided separately in:

`ICLR_submission/additional_experiments/ReadMe.md`

---

## Setup

**Python**: Compatible with Python 3.12.10

1. Create and activate a virtual environment:
   ```bash
   python -m venv venv
   source venv/bin/activate    # Linux/macOS
   # venv\Scripts\activate     # Windows: run this instead of the line above
   ```

2. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```

---

## Dataset Structure

### Synthetic Data
- Generated using `src/data_loader/data_generation_script.py`
- Stored in `datasets/synthetic_data/`
- Configuration files for generation parameters are in `datasets/synthetic_data/config/`

### Real Feedback Data
- The anonymised, normalised data is stored in `datasets/real_feedback/filtered.csv`
- **Columns**:
  - `main_vector`: Suggested path (features: `[realtime/static, travel time, transfers, walking time, walking edges]`)
  - `query_response`: Candidate paths (same features as `main_vector`)
  - `feedback`: User response (`positive` or `negative`)

Example Data Snippet:

| user_id | main_vector                        | query_response                                                  | feedback |
|---------|------------------------------------|------------------------------------------------------------------|----------|
| 15      | `['0', '0.09', 1.0, '0.02', 1]`    | `[['0', '0.13', 0.666..., '0.05', '0'], ...]`                    | positive |
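As the snippet shows, the feature values mix quoted strings and bare numbers. The following is a minimal sketch of how such cells might be converted to floats, assuming each cell is stored as a Python-style list literal. The helper names `parse_path` and `parse_candidates` are hypothetical and not part of this repository; the actual parsing lives in `src/data_loader/real_feedback_data_loader.py` and may differ.

```python
import ast

# Feature order as documented for `main_vector` (labels are illustrative).
FEATURES = ["realtime/static", "travel time", "transfers",
            "walking time", "walking edges"]


def parse_path(cell):
    """Parse one stringified path (a list literal) into a list of floats."""
    return [float(value) for value in ast.literal_eval(cell)]


def parse_candidates(cell):
    """Parse a stringified list of candidate paths into lists of floats."""
    return [[float(value) for value in path] for path in ast.literal_eval(cell)]


# `main_vector` cell copied from the example row above.
main_cell = "['0', '0.09', 1.0, '0.02', 1]"
# The example `query_response` cell is elided in the table, so the main
# vector is reused here as a single stand-in candidate (assumption).
candidates_cell = "[" + main_cell + "]"

main_vector = parse_path(main_cell)
candidates = parse_candidates(candidates_cell)
print(dict(zip(FEATURES, main_vector)))
```

`ast.literal_eval` only evaluates literals, so it is a safer choice than `eval` for cells read from a CSV file.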

---

## Usage

### 1. Run with Best Hyperparameters
**Real Dataset** (Limit=2):
```bash
python src/trip_planning.py --dataset real --input_dir ./datasets/real_feedback/ --limit 2 --use_default_params
```

**Synthetic Dataset** (Limit=5):
```bash
python src/trip_planning.py --dataset synthetic --input_dir ./datasets/synthetic_data/ --limit 5 --use_default_params
```

### 2. Hyperparameter Tuning with Optuna
```bash
python src/trip_planning.py --optuna --dataset {real/synthetic} --input_dir {dataset_path} --trials {num_trials}
```
Example:
```bash
python src/trip_planning.py --optuna --dataset real --input_dir ./datasets/real_feedback/ --trials 50
```

---

## Configuration
- **`config.py`** (in `src/`): Contains pre-tuned hyperparameters for both datasets.
- Use `--param_string` to override parameters with a custom string.

---

## Project Structure
```
main_experiments/
├── datasets/
│   ├── real_feedback/
│   │   ├── filtered.csv (normalised dataset used in training)
│   │   └── parsed_data.csv (raw anonymised feedback dataset for reference)
│   └── synthetic_data/
│       ├── df_data.csv
│       ├── df_interaction_map.csv
│       ├── df_od_map.csv
│       ├── df_users.csv
│       └── config/
├── src/
│   ├── data_loader/
│   │   ├── data_generation_script.py       # Synthetic data generation
│   │   ├── synthetic_data_loader.py        # Synthetic data loading
│   │   └── real_feedback_data_loader.py    # Real data loading
│   ├── trip_planning.py                    # Main entrypoint
│   ├── training.py                         # Training loop
│   ├── utils.py                            # Supporting functions
│   ├── config.py                           # Hyperparameters
│   ├── search.py                           # Optuna tuning
│   └── validate.py                         # Single-run validation
├── requirements.txt
└── ReadMe.md
```

---

## Notes
- The `--limit` flag controls the number of training points per user. Ensure that it does not exceed the number of data points available for a user, especially for the real feedback data.
- Optuna trials explore hyperparameters like learning rates, regularisation weights, and dynamic neighbour policies.
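
One way to check a `--limit` value before a run is to count rows per user. The sketch below assumes that each row of the dataset corresponds to one training point and that users are identified by the `user_id` column. The function `max_safe_limit` is a hypothetical helper, not part of this repository, and how `trip_planning.py` itself handles users with too few rows is not covered here.

```python
from collections import Counter


def max_safe_limit(user_ids):
    """Return the smallest per-user row count.

    Under the one-row-per-training-point assumption, this is the largest
    --limit value that every user in the data can satisfy.
    """
    counts = Counter(user_ids)
    return min(counts.values())


# Illustrative user_id column (hypothetical values, not taken from the dataset).
example_user_ids = [15, 15, 15, 22, 22, 31, 31, 31, 31]
print(max_safe_limit(example_user_ids))  # prints 2
```

In practice the `user_id` values could be read from `datasets/real_feedback/filtered.csv` with `csv.DictReader` and passed to the same function.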
