# Supplementary Code

Code supplement for the NeurIPS 2025 paper "When and how can inexact generative models still sample from the data manifold?"

To generate the plots in the figures, run the makefile. Trained weights for the MNIST sampler are found in the `models/weights` directory and are loaded automatically. To retrain the model, rename the existing weights and run the script `trainer.py`.

The image model uses the U-Net implementation written in PyTorch from https://github.com/lucidrains/denoising-diffusion-pytorch.

# Diffusion Model Sampler
Sampling engine for score-based generative diffusion models.


## Prerequisites

Before you begin, ensure you have the following installed:
* Python (version 3.x recommended)
* PyTorch (version 2.6.0+cu124 recommended)
* SciPy
* NumPy

## Usage

Run the script from your terminal using the following command structure:

```bash
python3 main.py --model <model_name> --output <output_prefix> [options]
```

### Command-Line Arguments

* `--model` (str, **required**): Select the model to sample from. See "Available Models" below.
* `--output` (str, **required**): A prefix for the output files.
* `--number-of-samples` (int, optional): Number of samples per noise realization. The Markov process depends on the Gaussian noise added at each step; this option generates multiple initial conditions driven by the same stochastic process. Default: `1`.
* `--number-of-noise-realizations` (int, optional): Specify the number of different noise realizations for the sampling process. Each noise realization represents a different path sample from the driving Brownian motion.
* `--root` (str, optional): Root directory to save generated data. Default: `./samples/`.
    * Data will be saved in subdirectories:
        * Backward trajectories: `<root>/trajectory/backward/<output>.pt`
        * Lyapunov exponents: `<root>/lyap-exp/<output>.pt`
        * Lyapunov vectors: `<root>/lyap-vec/<output>.pt`
* `--T` (float, optional): The stopping time for the forward diffusion process. Default: `0.9`.
* `--n-grid` (int, optional): Number of time discretization points to use for sampling with the Euler-Maruyama method. Default: `1000`.
* `--conf-file` (str, optional): Path to a JSON file containing specific configurations for the selected model. The path should be relative to the `conf` directory (e.g., if your file is `conf/my_config.json`, provide `my_config.json`).
* `--perturb-size` (float, optional): Size of the perturbation used in the Lyapunov exponent calculation, if the model implements one. Default: `0.0`.
* `--no-spectrum` (flag, optional): If set, the script will **not** calculate the Lyapunov spectrum along the trajectory. Calculating the Lyapunov spectrum requires an expensive series of QR factorizations.
* `--noise-schedule` (str, optional): Specifies a custom noise schedule for the sampling process. Options are 'linear', 'cosine' and 'None'.
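
The difference between `--number-of-samples` and `--number-of-noise-realizations` can be illustrated with a small sketch. It is not the repository's sampler: it applies Euler-Maruyama to a toy Ornstein-Uhlenbeck SDE, and the function name, the array layout `(n_noise, n_samples, dim)`, and the choice of `n_grid` steps of size `T / n_grid` are assumptions made for illustration. Each noise realization draws one Brownian increment per step, and all samples in that realization share it.

```python
import numpy as np


def euler_maruyama_shared_noise(x0, T=0.9, n_grid=1000, seed=0):
    """Integrate the toy SDE dX = -X dt + sqrt(2) dW with Euler-Maruyama.

    x0 has shape (n_noise, n_samples, dim). This layout is a hypothetical
    choice for this sketch, not the repository's data format.
    """
    rng = np.random.default_rng(seed)
    n_noise, _, dim = x0.shape
    dt = T / n_grid  # assumption: n_grid uniform steps on [0, T]
    x = x0.copy()
    for _ in range(n_grid):
        # One Brownian increment per noise realization; the middle axis of
        # size 1 broadcasts it over every sample in that realization.
        dW = rng.normal(0.0, np.sqrt(dt), size=(n_noise, 1, dim))
        x = x + (-x) * dt + np.sqrt(2.0) * dW
    return x


# 3 noise realizations, 4 initial conditions each, in 2 dimensions.
x0 = np.random.default_rng(1).normal(size=(3, 4, 2))
xT = euler_maruyama_shared_noise(x0)
print(xT.shape)
```

Because samples within a realization share their noise, the difference between any two of them evolves deterministically in this toy example, which is what makes shared-noise sampling useful for studying sensitivity to initial conditions.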


### Example

```bash
python3 main.py --model Singular2D --number-of-noise-realizations 3000 \
	--output singular --T 0.9 --n-grid 1000 --conf-file singular_blob.json \
	--noise-schedule cosine
```
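
To show how the flags in this command map to values, here is a hypothetical `argparse` parser that mirrors the documented options. It is a sketch, not the parser in `main.py`: the function name `build_parser` is invented for illustration, and the defaults for `--number-of-noise-realizations` and `--noise-schedule` are assumptions because the documentation does not state them.

```python
import argparse

MODELS = ["Singular2D", "TrainedImage", "HalfMoons"]


def build_parser():
    """Build a parser mirroring the documented flags (illustrative only)."""
    p = argparse.ArgumentParser(description="Diffusion model sampler (sketch)")
    p.add_argument("--model", required=True, choices=MODELS)
    p.add_argument("--output", required=True)
    p.add_argument("--number-of-samples", type=int, default=1)
    # Default is not documented; 1 is an assumption.
    p.add_argument("--number-of-noise-realizations", type=int, default=1)
    p.add_argument("--root", default="./samples/")
    p.add_argument("--T", type=float, default=0.9)
    p.add_argument("--n-grid", type=int, default=1000)
    p.add_argument("--conf-file", default=None)
    p.add_argument("--perturb-size", type=float, default=0.0)
    p.add_argument("--no-spectrum", action="store_true")
    # Default is not documented; "None" is an assumption.
    p.add_argument("--noise-schedule",
                   choices=["linear", "cosine", "None"], default="None")
    return p


# Parse the same arguments as the example command.
args = build_parser().parse_args([
    "--model", "Singular2D",
    "--number-of-noise-realizations", "3000",
    "--output", "singular",
    "--T", "0.9",
    "--n-grid", "1000",
    "--conf-file", "singular_blob.json",
    "--noise-schedule", "cosine",
])
print(args.model, args.number_of_noise_realizations, args.noise_schedule)
```

Note that `argparse` converts hyphens in flag names to underscores, so `--n-grid` is read as `args.n_grid`.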

## Available Models

The following models can be specified using the `--model` argument:

* `Singular2D` (from `models.singular_2d`)
* `TrainedImage` (from `models.trained_images`)
* `HalfMoons` (from `models.half_moons`)


## Output

The script generates and saves the following data as PyTorch tensor files (`.pt`):

1.  **Backward Trajectories**: The generated samples at each step of the backward diffusion process.
    * Saved in: `<root>/trajectory/backward/<output>.pt`
2.  **Lyapunov Exponents**: The calculated Lyapunov exponents for the trajectories.
    * Saved in: `<root>/lyap-exp/<output>.pt` (only if `--no-spectrum` is not used)
3.  **Lyapunov Vectors**: The calculated Lyapunov vectors corresponding to the exponents.
    * Saved in: `<root>/lyap-vec/<output>.pt` (only if `--no-spectrum` is not used and vectors are returned by `sample_backward`)
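
For readers unfamiliar with how a Lyapunov spectrum is obtained from repeated QR factorizations, the following is a minimal sketch of the standard procedure on a toy linear map. It is not the code in this repository: the function `lyapunov_spectrum` and its inputs are hypothetical, and whether the saved "Lyapunov vectors" correspond to the orthonormal frame returned here is an assumption.

```python
import numpy as np


def lyapunov_spectrum(jacobians, dt=1.0):
    """Estimate Lyapunov exponents from per-step Jacobians via QR.

    Returns the exponents and the final orthonormal frame.
    """
    dim = jacobians[0].shape[0]
    q = np.eye(dim)
    log_sums = np.zeros(dim)
    for jac in jacobians:
        # Push the orthonormal frame through the linearized step, then
        # re-orthonormalize; |diag(R)| is the stretching in each direction.
        q, r = np.linalg.qr(jac @ q)
        log_sums += np.log(np.abs(np.diag(r)))
    # Average growth rate per unit time.
    return log_sums / (len(jacobians) * dt), q


# Toy example: a constant linear map with eigenvalues 2 and 0.5.
A = np.array([[2.0, 1.0], [0.0, 0.5]])
exponents, frame = lyapunov_spectrum([A] * 200)
print(exponents)
```

For a constant map the exponents are the logarithms of the eigenvalue magnitudes. The cost of one QR factorization per step is the reason `--no-spectrum` exists.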

The `<root>` directory is `./samples/` by default and can be changed with the `--root` argument. The `<output>` is specified by the `--output` argument.
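
The layout above can be summarized by a small helper that builds the expected file paths from `--root` and `--output`. The helper `output_paths` and its dictionary keys are hypothetical names used only for illustration.

```python
from pathlib import Path


def output_paths(output, root="./samples/", no_spectrum=False):
    """Return the documented output locations for a given run."""
    root = Path(root)
    paths = {"trajectory": root / "trajectory" / "backward" / f"{output}.pt"}
    if not no_spectrum:
        paths["lyap_exp"] = root / "lyap-exp" / f"{output}.pt"
        # This file may be absent if the model does not return vectors.
        paths["lyap_vec"] = root / "lyap-vec" / f"{output}.pt"
    return paths


for name, path in output_paths("singular").items():
    print(name, path.as_posix())
```

Each existing file can then be read with `torch.load(path)`.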

## Configuration ⚙️

Model-specific configurations can be provided using a JSON file via the `--conf-file` argument. These configuration files should be placed in a `conf/` directory relative to the script's location.

Example `conf/singular_blob.json`:
```json
{
    "means": [
        [-1.0, -0.2],
        [1.0, 0.5]
    ],
    "covs": [
        [[0.5, 0], [0, 0.5]],
        [[0.2, 0], [0, 0.2]]
    ],
    "coeffs": [
        [0.0, 0.0, 0.4, 1.0, 0.4],
        [0.0, 1.0, 0.2, 0.0, 0.1]
    ],
    "N_Integral" : 400,
    "fatten": 5e-4
}
```
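
Before launching a long run, it can help to check that a configuration file parses and has consistent shapes. The sketch below does this for the example above using only the standard library. The function `check_config` is hypothetical, and the rule it enforces (one square covariance matrix per mean, with matching dimension) is an assumption inferred from the example rather than a documented requirement.

```python
import json

EXAMPLE = """
{
    "means": [[-1.0, -0.2], [1.0, 0.5]],
    "covs": [[[0.5, 0], [0, 0.5]], [[0.2, 0], [0, 0.2]]],
    "coeffs": [[0.0, 0.0, 0.4, 1.0, 0.4], [0.0, 1.0, 0.2, 0.0, 0.1]],
    "N_Integral": 400,
    "fatten": 5e-4
}
"""


def check_config(cfg):
    """Check mean/covariance shapes and return the number of pairs."""
    means, covs = cfg["means"], cfg["covs"]
    if len(means) != len(covs):
        raise ValueError("expected one covariance matrix per mean")
    for mean, cov in zip(means, covs):
        dim = len(mean)
        if len(cov) != dim or any(len(row) != dim for row in cov):
            raise ValueError("covariance must be dim x dim for each mean")
    return len(means)


cfg = json.loads(EXAMPLE)
n_pairs = check_config(cfg)
print(n_pairs, cfg["N_Integral"], cfg["fatten"])
```

To check a file on disk instead, the string can be replaced by the contents of the file under the `conf/` directory, read with `json.load`.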

For more details on each model's configuration, please see the corresponding object definitions.
