# Variational Potential Flow (VAPO)

Energy-based models (EBMs) are appealing for their generality and simplicity in data likelihood modeling, but have conventionally been difficult to train due to unstable and time-consuming intermediate MCMC sampling during contrastive divergence training.
In this paper, we present a novel energy-based generative framework, VAPO, that entirely dispenses with intermediate MCMC sampling and complementary models. The VAPO framework aims to learn a potential energy function whose gradient (flow) guides the prior samples, so that their density evolution closely follows an approximate data likelihood homotopy. A variational loss objective is then derived to minimize the Kullback-Leibler divergence between the density evolution of the flow-driven prior and the data likelihood homotopy.
Synthetic images can be generated after training the potential energy by initializing the samples from a Gaussian prior and solving the ODE governing the potential flow on a fixed time interval using generic ODE solvers. Experimental results show that the proposed VAPO framework is capable of generating high-fidelity images on various image datasets. In particular, our proposed framework achieves competitive FID scores for unconditional image generation on the CIFAR-10 and CelebA datasets.
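
The sampling procedure described above can be illustrated on a toy problem. The sketch below is not the repository's sampler. It assumes a hand-written one-dimensional quadratic potential in place of the trained network (`MU`, `potential_grad`, and `rk4_flow` are hypothetical names), and it uses a fixed-step classical Runge-Kutta (RK4) integrator instead of an adaptive solver such as RK45, purely to stay dependency-free.

```python
import math
import random

MU = 2.0  # hypothetical "data mode" of the toy potential (assumption)


def potential_grad(x: float, t: float) -> float:
    """Gradient of the toy potential Phi(x, t) = -0.5 * (x - MU)**2.

    A trained model would supply this gradient; the closed form here
    is only a stand-in for illustration and ignores t.
    """
    return -(x - MU)


def rk4_flow(x0: float, t0: float = 0.0, t1: float = 1.0, steps: int = 100) -> float:
    """Integrate dx/dt = grad Phi(x, t) from t0 to t1 with classical RK4."""
    h = (t1 - t0) / steps
    x, t = x0, t0
    for _ in range(steps):
        k1 = potential_grad(x, t)
        k2 = potential_grad(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = potential_grad(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = potential_grad(x + h * k3, t + h)
        x += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += h
    return x


random.seed(0)
prior_samples = [random.gauss(0.0, 1.0) for _ in range(5)]  # Gaussian prior
generated = [rk4_flow(x0) for x0 in prior_samples]          # flow-driven samples
```

For this toy potential the flow has the closed form `MU + (x0 - MU) * exp(-1)` at `t = 1`, so every prior sample is pulled toward `MU`. A trained potential would replace `potential_grad`, and the state would be an image tensor rather than a scalar.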

---

*Acknowledgement:* Our implementation relies on the repo https://github.com/Newbeeer/Poisson_flow. 

## Dependencies

The Python dependencies for our code (Python 3.9.12, CUDA 11.6) can be installed as follows:

```sh
pip install -r requirements.txt
```

## Usage

Train and evaluate our models through `main.py`.

```sh
python3 main.py:
  --config: Training configuration.
  --eval_folder: The folder name for storing evaluation results (default: 'eval')
  --mode: <train|eval>: Running mode: train or eval
  --workdir: Working directory
```

For example, to train a new VAPO model on the CIFAR-10 dataset, one could execute:

```sh
python3 main.py --config ./configs/homotopy/cifar10.py --mode train --workdir homotopy_cifar10
```

* `config` is the path to the config file. The prescribed config files are provided in `configs/`.

**Naming conventions of config files**: the path of a config file is a combination of the following dimensions:

* method: **VAPO**: `homotopy`
* dataset: One of `cifar10`, `celeba64`
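
As a minimal sketch of this naming convention, the helper below joins the two dimensions into a config path. `config_path` is a hypothetical name, not part of the repository, and the `./configs/<method>/<dataset>.py` layout is inferred from the CIFAR-10 training command above, so paths for other datasets are an assumption.

```python
def config_path(method: str, dataset: str, root: str = "./configs") -> str:
    """Build a config path of the assumed form <root>/<method>/<dataset>.py."""
    return f"{root}/{method}/{dataset}.py"


# Reproduces the path used in the CIFAR-10 training command.
cifar10_config = config_path("homotopy", "cifar10")
```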

**Important note:** We use a relatively large batch size for training (`training.batch_size=256` for CIFAR-10, ~19 GB GPU memory usage; `training.batch_size=128` for CelebA $64^2$, ~31 GB GPU memory usage). To reduce GPU memory usage, lower the `training.batch_size` parameter in the config files.


*  `workdir` is the path that stores all artifacts of one experiment, like checkpoints, samples, and evaluation results.

* `eval_folder` is the name of a subfolder in `workdir` that stores all artifacts of the evaluation process, like meta checkpoints for pre-emption prevention, image samples, and numpy dumps of quantitative results.

* `mode` is either "train" or "eval". When set to "train", it starts training a new model, or resumes training an existing model if its meta-checkpoints (used to resume a run after pre-emption in a cloud environment) exist in `workdir/checkpoints-meta`.

* The evaluation command-line flags are listed below:

`--config.eval.enable_sampling`: Generate samples and evaluate sample quality, measured by FID and Inception score.

`--config.eval.dataset=train/test`: Indicate whether to compute the likelihoods on the training or test dataset.

`--config.eval.enable_interpolate`: Generate image interpolations.
  

## Checkpoints

Please place the pretrained checkpoints under the directory `workdir/checkpoints`, e.g., `homotopy_cifar10/checkpoints`.
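
The directory layout implied by the descriptions of `workdir` and `eval_folder` can be summarized in a short sketch. `experiment_layout` is a hypothetical helper, not part of the repository. It only assembles the paths named in this README and does not create or inspect any files.

```python
from pathlib import PurePosixPath


def experiment_layout(workdir: str, eval_folder: str = "eval") -> dict:
    """Return the sub-directories of one experiment as described in this README."""
    root = PurePosixPath(workdir)
    return {
        "checkpoints": root / "checkpoints",            # pretrained or trained checkpoints
        "checkpoints_meta": root / "checkpoints-meta",  # meta-checkpoints for resuming
        "eval": root / eval_folder,                     # evaluation artifacts
    }


layout = experiment_layout("homotopy_cifar10")
```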

To generate 10k samples from the VAPO model and evaluate their FID/IS, you could execute:

```shell
python3 main.py --config ./configs/homotopy/cifar10.py --mode eval --workdir homotopy_cifar10 --config.eval.enable_sampling --config.eval.num_samples 10000
```

To only generate and visualize 100 samples from the VAPO model, you could execute:

```shell
python3 main.py --config ./configs/homotopy/cifar10.py --mode eval --workdir homotopy_cifar10 --config.eval.enable_sampling --config.eval.save_images --config.eval.batch_size 100
```

The samples will be saved to `homotopy_cifar10/eval/`.
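
The two evaluation commands above differ only in a few flags, so scripted runs can assemble them with a small helper. The sketch below is an illustration only: `build_eval_command` is a hypothetical name, and it uses just the flags that appear in this README.

```python
import shlex
from typing import List, Optional


def build_eval_command(
    config: str,
    workdir: str,
    num_samples: Optional[int] = None,
    save_images: bool = False,
    batch_size: Optional[int] = None,
) -> List[str]:
    """Assemble the argument list for an evaluation run with sampling enabled."""
    args = [
        "python3", "main.py",
        "--config", config,
        "--mode", "eval",
        "--workdir", workdir,
        "--config.eval.enable_sampling",
    ]
    if num_samples is not None:
        args += ["--config.eval.num_samples", str(num_samples)]
    if save_images:
        args.append("--config.eval.save_images")
    if batch_size is not None:
        args += ["--config.eval.batch_size", str(batch_size)]
    return args


cmd = build_eval_command(
    "./configs/homotopy/cifar10.py", "homotopy_cifar10", num_samples=10000
)
print(shlex.join(cmd))  # pass `cmd` to subprocess.run(cmd, check=True) to launch it
```

Building the command as a list rather than a single string avoids shell-quoting issues when the list is passed to `subprocess.run`.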

Pre-trained model checkpoints (`homotopy_cifar10` for CIFAR-10 and `homotopy_celeba` for CelebA $64^2$) are provided in this [Google drive folder](https://drive.google.com/drive/folders/1dNNK930E5TqJnOzBE6e-JmXo9BJe76MM?usp=sharing).

| Dataset              | Checkpoint path                                              | NFE (RK45 sampling) | GPU memory usage (training) |
| -------------------- | :----------------------------------------------------------- | :---: | :---: |
| CIFAR-10             | `homotopy_cifar10/checkpoints` | ~146 | ~19 GB |
| CelebA $64^2$        | `homotopy_celeba/checkpoints` | ~278 | ~31 GB |

### FID statistics

Please find the pre-computed statistics files for FID scores in the following links and place them under the directory `assets/stats`:

[CIFAR-10](https://drive.google.com/file/d/1YyympxZ95l6_ane0TxYt94yqeiGcOBNG/view?usp=sharing),  [CelebA 64](https://drive.google.com/file/d/1dzSsmBvJOjDy12VzdypWDVYBF8b9yRkm/view?usp=sharing)
