# Learning-to-Rank for AutoML: A Simple and Robust Alternative to Score-Based Pipeline Selection

This repository contains code to reproduce the experiments for evaluating ranking-based meta-learning strategies for automated machine learning (AutoML). It includes benchmarking models, score vs. rank comparisons, and robustness analyses under different noise levels.

---

## 📁 Project Structure

```
autofolio/
│
├── 1-Download_data.ipynb         # Script to download meta-datasets
├── 2-Execute_experiments.ipynb   # Run meta-learning experiments
├── 3-Summarize_and_visualize_results.ipynb
│                               
├── 4-AutoFolio_experiments.ipynb # Run AutoFolio experiment
├── 5-AutoFolio_results.ipynb     # Summarize and visualize AutoFolio results
├── 6-Robustness-experiment.ipynb
│
├── columns.py                    # Utility for managing dataset column names
├── ltr_utils.py                  # Utility functions for experiments
├── metrics.py                    # Evaluation metrics (e.g., NDCG, MRR, etc.)
├── models.py                     # Baseline and LTR model definitions
├── summarize.py                  # Helper functions to aggregate results
├── visualization.py              # Plotting functions
│
├── LICENSE.txt
├── requirements.txt
├── README.md
│
├── autofolio/                    # Pre-computed data for the experiments in notebook 2
├── benchmarking/                 # Pre-computed data for the experiments in notebook 4
└── grammar/                      # Configuration-space grammars used to compute pipeline information
```

---

## 🚀 Getting Started

### Python Versions

- Most experiments require **Python 3.9**.
- AutoFolio-specific experiments require **Python 3.5** due to library compatibility.

### Installation

Create a virtual environment and install dependencies:

```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

---

## 📊 Experiments

The project is structured into modular notebooks:

- **Step 1:** `1-Download_data.ipynb`  
  Downloads and prepares OpenML and AutoFolio benchmark data.

- **Step 2:** `2-Execute_experiments.ipynb`  
  Runs score-based and rank-based meta-learning experiments.

- **Step 3:** `3-Summarize_and_visualize_results.ipynb`  
  Aggregates metrics (NDCG, MRR, Kendall's Tau) and visualizes results.

- **Steps 4-5:** `4-AutoFolio_experiments.ipynb` and `5-AutoFolio_results.ipynb`  
  Run the AutoFolio-specific experiments and summarize their results.

- **Step 6:** `6-Robustness-experiment.ipynb`  
  Evaluates robustness to Gaussian noise in meta-data.
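As a rough illustration of the perturbation idea behind Step 6, the sketch below adds zero-mean Gaussian noise to a numeric table. It is only a sketch under stated assumptions: the function name `add_gaussian_noise`, the example values, and the noise levels are hypothetical and are not taken from the notebook, which may perturb the meta-data differently.

```python
import random


def add_gaussian_noise(rows, sigma, seed=0):
    """Return a copy of `rows` with N(0, sigma^2) noise added to every value.

    `rows` is a list of lists of floats (a hypothetical stand-in for the
    meta-data table); a private, seeded generator keeps the result repeatable.
    """
    rng = random.Random(seed)
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in rows]


# Hypothetical meta-data: one row per dataset, one column per pipeline score.
meta_data = [[0.90, 0.70, 0.80], [0.55, 0.65, 0.60]]

# Hypothetical noise levels; sigma = 0.0 leaves the table unchanged.
for sigma in (0.0, 0.01, 0.05):
    noisy = add_gaussian_noise(meta_data, sigma, seed=42)
    print(sigma, noisy[0])
```

Using a local `random.Random(seed)` instead of the global generator means the same seed and noise level always produce the same perturbed table, independent of other random calls in the session.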

---

## 📈 Metrics

We report three types of metrics:

- **Ranking Metrics:**
  - **NDCG** (Normalized Discounted Cumulative Gain)
  - **MRR** (Mean Reciprocal Rank)
  - Evaluated at cutoff positions $k \in \{1, 5, 10\}$ (or $\{1, 10, 100\}$ for AMLB scenarios).

- **Correlation Metrics:**
  - **Kendall's Tau**
  - **Spearman's Rho**
  - Used to assess global ranking agreement.

- **System-Level Metrics:**
  - **SCORE:** Average of the target objective (e.g., accuracy or error).
  - **TTB (Time-To-Best):** Time required to find the best pipeline.
  - **AVG_RANK:** Average rank difference between score- and rank-based methods.
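To make the ranking and correlation metrics concrete, here is a minimal, standard-library sketch of NDCG@k, MRR@k, and Kendall's Tau. It is not the code in `metrics.py`: the function names are illustrative, the gain in NDCG is assumed to be linear in the true score, the reciprocal rank is taken with respect to the single best pipeline, and the Tau variant shown is tau-a (ties count as neither concordant nor discordant).

```python
import math
from itertools import combinations


def _order(predicted):
    """Item indices sorted by predicted score, best first."""
    return sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k relevance values."""
    return sum(rel / math.log2(pos + 2) for pos, rel in enumerate(relevances[:k]))


def ndcg_at_k(true_scores, predicted, k):
    """DCG of the predicted ordering divided by the DCG of the ideal ordering."""
    ranked = [true_scores[i] for i in _order(predicted)]
    ideal_dcg = dcg_at_k(sorted(true_scores, reverse=True), k)
    return dcg_at_k(ranked, k) / ideal_dcg if ideal_dcg > 0 else 0.0


def reciprocal_rank_at_k(true_scores, predicted, k):
    """1 / position of the truly best item if it is in the top k, else 0."""
    best = max(range(len(true_scores)), key=lambda i: true_scores[i])
    for pos, idx in enumerate(_order(predicted)[:k], start=1):
        if idx == best:
            return 1.0 / pos
    return 0.0


def mrr_at_k(tasks, k):
    """Mean reciprocal rank over (true_scores, predicted) pairs, one per task."""
    return sum(reciprocal_rank_at_k(t, p, k) for t, p in tasks) / len(tasks)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    if len(x) < 2:
        raise ValueError("need at least two items")
    balance = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        balance += (sign > 0) - (sign < 0)
    return balance / (len(x) * (len(x) - 1) / 2)


# Hypothetical example: true accuracy of four pipelines vs. a model's predictions.
true_scores = [0.90, 0.70, 0.80, 0.60]
predicted = [0.5, 0.9, 0.7, 0.1]
print(ndcg_at_k(true_scores, predicted, k=1))
print(mrr_at_k([(true_scores, predicted)], k=5))
print(kendall_tau_a(true_scores, predicted))
```

The cutoff `k` only affects the top-weighted metrics (NDCG, MRR); Kendall's Tau compares the two orderings over all pairs, which is why it is reported as a measure of global agreement.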

---

## 🧪 Reproducibility

The experiments use random seeds so that evaluation is deterministic.  
To reproduce the key experiments, run the notebooks in numerical order.
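One common way to fix seeds at the top of a notebook is sketched below. The helper name `set_seed` and the seed value are assumptions for illustration; the notebooks may seed their libraries in a different way.

```python
import random


def set_seed(seed):
    """Seed Python's global RNG and, if it is installed, NumPy's legacy global RNG."""
    random.seed(seed)
    try:
        import numpy as np
    except ImportError:
        return  # NumPy not available; only the standard library RNG is seeded.
    np.random.seed(seed)


set_seed(42)  # hypothetical seed value
first = random.random()
set_seed(42)
second = random.random()
print(first == second)
```

Libraries that keep their own random state (for example, model classes that accept a `random_state` argument) still need the seed passed to them explicitly.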

### AutoFolio Reproducibility Notes

For the AutoFolio experiments, we modified two components:

1. The `autofolio.py` main class from the AutoFolio library, to return the ID of the winning pipeline (instead of only the score).
2. The `ASLibScenario` class from the ASLib library, to fix a bug that occurred when reading the test CSV file.

The modified versions of both files are included in the `autofolio/` directory of this repository.

---

## 📄 License

This project is licensed under the Apache License, Version 2.0 (January 2004). See `LICENSE.txt` or visit [apache.org/licenses](http://www.apache.org/licenses/) for details.

---
