# Supplementary material of "COPA: Comparing the incomparable in multi-objective model evaluation"

This folder contains the material needed to reproduce all experiments from NeurIPS 2025 submission 13217, titled "COPA: Comparing the incomparable in multi-objective model evaluation".

This material belongs to an official submission under review; *do not distribute it under any circumstances*.

## Installation and running

To install the dependencies, we recommend using `uv` (see [here](https://docs.astral.sh/uv/getting-started/installation/#installation-methods) for more details), which can be installed on macOS/Linux by running:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```

Then, navigate to this folder in the terminal and install all dependencies by running:

```bash
uv sync
```

Finally, start an instance of Jupyter Lab with:

```bash
uv run jupyter-lab
```
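The notebooks can also be executed without opening Jupyter Lab. The following is a minimal sketch, assuming `nbconvert` is available in the synced environment (it is normally installed alongside Jupyter Lab). The helper name `run_notebook` and the `executed/` output directory are hypothetical and chosen only for illustration:

```bash
# Hypothetical helper: execute one notebook headlessly through uv.
# Assumes nbconvert is part of the environment created by `uv sync`.
run_notebook() {
  # --execute runs every cell; --output-dir keeps the original file untouched
  # by writing the executed copy into the (hypothetical) executed/ directory.
  uv run jupyter nbconvert --to notebook --execute --output-dir executed "$1"
}
```

For example, `run_notebook synthetic-case.ipynb` would execute that notebook and store the executed copy under `executed/`.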

## Notebook descriptions

- Figures 1 and 10: `llm-fig1.ipynb`

- Figures 2, 3, and 9: `synthetic-case.ipynb`

- Figure 4: `llm-fig4.ipynb`

- Figures 5 and 13: `fairgrad-case.ipynb`

- Figures 6 and 14a: `mtl.ipynb`

- Figures 7 and 14b: `domain-generalization.ipynb`

- Figure 11: `llm-piecewise-criterion.ipynb`

- Figure 12: `llm-decodingtrust.ipynb`

- Figure 15a: `domain-generalization-appendix-1.ipynb`

- Figure 15b: `domain-generalization-appendix-2.ipynb`
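Before running anything, it may help to confirm that every notebook listed above is present in the current folder. The following is a small sketch, not part of the submission itself; the variable names are illustrative:

```bash
#!/usr/bin/env bash
# Notebook file names copied from the list above.
notebooks=(
  llm-fig1.ipynb
  synthetic-case.ipynb
  llm-fig4.ipynb
  fairgrad-case.ipynb
  mtl.ipynb
  domain-generalization.ipynb
  llm-piecewise-criterion.ipynb
  llm-decodingtrust.ipynb
  domain-generalization-appendix-1.ipynb
  domain-generalization-appendix-2.ipynb
)

found=0
missing=0
for nb in "${notebooks[@]}"; do
  if [ -f "$nb" ]; then
    found=$((found + 1))
  else
    # Report each notebook that is not in the current directory.
    missing=$((missing + 1))
    echo "missing: $nb"
  fi
done

echo "found $found of ${#notebooks[@]} notebooks"
```

Run it from this folder; any file reported as missing points to an incomplete copy of the supplementary material.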

**Note:** We have decided to release the notebook for the AMLB figures *only* once we have explicit approval from the original authors, as the changes required to reproduce our figures from their notebooks are *minimal* (i.e., defining the same functions we use in all notebooks and plugging them in).
