<div align="center">

# Meta Flow Matching

## Description

<div align="left">

**Meta Flow Matching (MFM)** is a practical approach to integrating along vector fields on the Wasserstein manifold by amortizing the _flow_ model over the initial distributions. Current flow-based models are limited to a single initial distribution/population and a set of predefined conditions which describe different dynamics.

In the natural sciences, many processes can be represented as vector fields on the Wasserstein manifold of probability densities; that is, the change of the population at any moment in time depends on the population itself, due to interactions between samples/particles. One application domain is personalized medicine, where the development of diseases and the response to treatments depend on the microenvironment of cells specific to each patient.
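
To make the population dependence concrete, here is a toy sketch (not taken from the paper) in which each particle's velocity depends on the current population mean. The drift `mean_field_velocity`, the 1D setting, and the Euler step count are all illustrative assumptions.

```python
import random

def mean_field_velocity(x, population):
    """Toy drift: a particle is pulled toward the current population mean."""
    mean = sum(population) / len(population)
    return mean - x

def integrate(population, n_steps=100, t_end=1.0):
    """Explicit Euler integration of dx/dt = v(x, p_t) for every particle."""
    dt = t_end / n_steps
    for _ in range(n_steps):
        # The velocity is evaluated against the *current* population p_t.
        population = [x + dt * mean_field_velocity(x, population) for x in population]
    return population

random.seed(0)
p0_a = [random.gauss(-2.0, 1.0) for _ in range(200)]  # made-up population A
p0_b = [random.gauss(3.0, 0.5) for _ in range(200)]   # made-up population B
p1_a = integrate(p0_a)
p1_b = integrate(p0_b)
```

The same point `x = 0.0` moves left in population A and right in population B, so a single vector field conditioned only on `x` and `t` cannot describe both.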

In MFM, we jointly train a vector field model $v_t(\cdot | \varphi(p_0; \theta); \omega)$ and a population embedding model $\varphi(p_0; \theta)$. Initial populations are embedded into lower-dimensional representations using a Graph Neural Network (GNN). This lets MFM generalize to unseen initial distributions, unlike previously proposed methods. We show that MFM improves the prediction of individual treatment responses on a [large-scale multi-patient single-cell drug screen dataset (Ramos Zapatero et al., Cell, 2023)](https://www.cell.com/cell/pdf/S0092-8674(23)01220-5.pdf).
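
As a rough illustration of how the embedding conditions the vector field, the sketch below evaluates a flow matching loss for one pair of populations. It assumes the standard conditional flow matching objective with a linear interpolant between index-paired samples; mean pooling stands in for the GNN, the vector field is a plain linear map, and no gradient step is shown. All function names are hypothetical and do not refer to code in this repository.

```python
import random

def embed_population(p0):
    """Stand-in for phi(p0; theta): mean-pool the samples of a population.

    The README describes a GNN here; mean pooling is a simplifying assumption.
    """
    dim = len(p0[0])
    return [sum(x[i] for x in p0) / len(p0) for i in range(dim)]

def vector_field(x_t, t, emb, weights):
    """Hypothetical v_t(x | emb; omega): a linear map of [x_t, t, emb, 1]."""
    features = x_t + [t] + emb + [1.0]  # trailing 1.0 acts as a bias input
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def mfm_loss(p0, p1, weights, rng):
    """Monte Carlo flow matching loss for one (p0, p1) population pair."""
    emb = embed_population(p0)  # one embedding per initial population
    total = 0.0
    for x0, x1 in zip(p0, p1):  # index-wise pairing is an assumption
        t = rng.random()
        x_t = [(1 - t) * a + t * b for a, b in zip(x0, x1)]  # linear interpolant
        target = [b - a for a, b in zip(x0, x1)]  # its time derivative
        pred = vector_field(x_t, t, emb, weights)
        total += sum((p - u) ** 2 for p, u in zip(pred, target))
    return total / len(p0)

rng = random.Random(0)
dim = 2
p0 = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(64)]
p1 = [[v + 1.0 for v in x] for x in p0]  # toy target: shift every sample by 1

zero_weights = [[0.0] * (2 * dim + 2) for _ in range(dim)]
bias_weights = [row[:-1] + [1.0] for row in zero_weights]  # predicts constant 1
print(mfm_loss(p0, p1, zero_weights, rng))  # close to 2.0: target is (1, 1)
print(mfm_loss(p0, p1, bias_weights, rng))  # close to 0.0: field matches target
```

In joint training, this loss would be averaged over many population pairs and minimized with respect to both the embedding parameters $\theta$ and the vector field parameters $\omega$.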

The raw data can be downloaded here: [Raw organoid data](https://data.mendeley.com/datasets/hc8gxwks3p/1). For usability, we provide the notebook [trellis_data.ipynb](notebooks/trellis_data.ipynb) which contains further dataset details and code for the data preprocessing.

## How to run

Install dependencies

```bash
# [OPTIONAL] create conda environment
conda create -n mfm python=3.9
conda activate mfm

# install pytorch according to instructions
# https://pytorch.org/get-started/

# install requirements
pip install -r requirements.txt
```

Train the model with a chosen experiment configuration from [src/conf/experiment/](src/conf/experiment/):

```bash
python train.py experiment=experiment_name.yaml
```

You can override any parameter from the command line like this:

```bash
python train.py experiment=experiment_name.yaml trainer.max_epochs=1234 seed=42
```
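
Conceptually, each `dotted.key=value` argument replaces one entry in a nested configuration. The `key=value` syntax resembles Hydra's override convention, though this README does not name the configuration framework, so treat that as an assumption. The sketch below is not the framework's implementation; `apply_overrides` and the default values are hypothetical.

```python
def apply_overrides(config, overrides):
    """Apply 'dotted.key=value' strings to a nested dict, in place."""
    for item in overrides:
        key, _, raw = item.partition("=")
        *parents, leaf = key.split(".")
        node = config
        for name in parents:
            # Walk down the nested dicts, creating levels that do not exist yet.
            node = node.setdefault(name, {})
        # Interpret integers where possible, otherwise keep the raw string.
        node[leaf] = int(raw) if raw.lstrip("-").isdigit() else raw
    return config

config = {"trainer": {"max_epochs": 100}, "seed": 0}  # made-up defaults
apply_overrides(config, ["trainer.max_epochs=1234", "seed=42"])
print(config)
```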

To train a model via MFM on the synthetic letters setting, use

```bash
python train.py experiment=letters_mfm.yaml
```

To run the biological experiments, first download the data and preprocess it with the [trellis_data.ipynb](notebooks/trellis_data.ipynb) notebook. Then, as in the synthetic letters experiment, executing

```bash
python train.py experiment=trellis_mfm.yaml
```

will train a single seed of an MFM model on the organoid drug-screen dataset.

To replicate an experiment over several seeds, for example the last row of Table 1 in the paper, you can use the multi-run feature:

```bash
python train.py -m experiment=letters_mfm.yaml seed=1,2,3
```
