# Source Code: Turbulent Flow Simulation using Autoregressive Conditional Diffusion Models
This is the source code for the paper Turbulent Flow Simulation using Autoregressive Conditional Diffusion Models. All copyrights belong to the authors.


-----------------------------------------------------------------------------------------------------

## Installation
The following instructions assume Linux as the OS, but the installation on Windows should be similar.

We recommend installing the required python packages (see `requirements.yml`) via a conda environment (e.g. using [miniconda](https://docs.conda.io/en/latest/miniconda.html)), but it may be possible to install them with *pip* (e.g. via *venv* for a separate environment) as well.
```shell
conda env create -f requirements.yml
conda activate ACDM
```
In the following, all commands should be run from the root directory of this source code. Running the training or sampling code in the `src` directory requires the generation of data sets as described in the following.


## Basic Usage
Once the data sets are generated, models can be trained using the scripts `src/training_*.py`, trained models can be sampled with `src/sample_models_*.py`, and model predictions can be evaluated and visualized with `src/plot_*.py`. Each script contains various configuration options at the beginning of the file. All files should be run according to the following pattern:
```shell
python src/training_*.py
python src/sample_models_*.py
python src/plot_*.py
```
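
Since the concrete script names behind the `*` wildcards depend on the experiment, it can help to list what is available first. The following is a minimal sketch that only relies on the file name patterns given above; the helper `list_stage_scripts` is hypothetical and not part of this source code.

```python
from pathlib import Path

# Glob patterns taken from the usage description above.
STAGE_PATTERNS = {
    "training": "training_*.py",
    "sampling": "sample_models_*.py",
    "plotting": "plot_*.py",
}


def list_stage_scripts(src_dir="src"):
    """Return {stage: sorted list of matching script paths} for src_dir."""
    src = Path(src_dir)
    if not src.is_dir():
        # Nothing to list, e.g. when not run from the repository root.
        return {stage: [] for stage in STAGE_PATTERNS}
    return {stage: sorted(src.glob(pattern)) for stage, pattern in STAGE_PATTERNS.items()}


if __name__ == "__main__":
    # Run from the root directory of this source code.
    for stage, scripts in list_stage_scripts().items():
        print(f"{stage}:")
        for script in scripts:
            print(f"  python {script.as_posix()}")
```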

## Directory Structure
The directory `src/turbpred` contains the general code base that the training and sampling scripts rely on. The `src/lsim` directory contains the LSIM metric from Kohl et al. that is used for evaluations. The `data` directory contains data generation scripts, and downloaded or generated data sets should end up there as well. The `runs` directory contains the trained models that are loaded by the sampling scripts, as well as further checkpoints and log files. The `results` directory contains the results from the sampling, evaluation, and plotting scripts. Sampled model predictions are written to this directory as compressed numpy arrays, which are read by the plotting scripts, which in turn write the resulting plots to the same directory.
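
To check what a sampling run has written, the compressed numpy arrays can be inspected directly. The sketch below assumes only that the files are standard `.npz` archives somewhere under `results`; the array names and shapes inside each archive are not documented here, so it simply reports whatever it finds. The helper `summarize_npz` is hypothetical.

```python
from pathlib import Path

import numpy as np


def summarize_npz(path):
    """Return {array name: (shape, dtype string)} for one .npz archive."""
    summary = {}
    with np.load(path) as archive:
        for name in archive.files:
            array = archive[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary


if __name__ == "__main__":
    # Assumption: run from the repository root after sampling has finished.
    for npz_path in sorted(Path("results").rglob("*.npz")):
        print(npz_path.as_posix(), summarize_npz(npz_path))
```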


## Training Monitoring with Tensorboard
During training, various values, statistics, and plots are logged to Tensorboard, which makes it possible to monitor the training progress. To start Tensorboard, use the following command:
```shell
tensorboard --logdir=runs --port=6006
```
and open http://localhost:6006/ in your browser to inspect the logged data.



-----------------------------------------------------------------------------------------------------

## Data Generation and Processing

### Downloading our Data
Our simulated data sets will be made available to download upon publication. Thus, all data sets have to be generated locally for now as described below.

### Generation with PhiFlow: Incompressible Wake Flow (*Inc*)
<details>
<summary>Click to expand detailed PhiFlow instructions</summary>

To generate data with the fluid solver PhiFlow, perform the following steps:
1. Download the [PhiFlow source code](https://github.com/tum-pbs/PhiFlow) and follow the [installation instructions](https://tum-pbs.github.io/PhiFlow/Installation_Instructions.html). We use the PyTorch backend, which should work out of the box with a correct installation of PhiFlow. **Our scripts assume the usage of version 2.0.3 at commit [abc82af2](https://github.com/tum-pbs/PhiFlow/tree/abc82af247a4f9de49a0d246277a421739eed7c1)! Substantially newer versions might not work.**
2. Ensure that the packages *numpy*, *matplotlib*, and *imageio* are installed in the python environment used for PhiFlow.
3. Add our data generation scripts that handle the solver setup and data export to the PhiFlow installation by copying all files from the `data/generation_scripts/PhiFlow` directory to the `demos` directory in your PhiFlow directory.
4. The copied files contain the PhiFlow scenes for the *Inc* training and test data sets (.py files), which can be run in the same way as the other example PhiFlow scene files in the `demos` directory. The corresponding batch generation scripts (.sh files) simply run the scene multiple times with different parameters to build the full data set.
5. Adjust paths and settings in the python generation file if necessary, and run it or alternatively the batch script to generate the data.
6. Copy or move the generated data set directory to the `data` directory of this source code for training. Make sure to follow the data set structure described below.
</details>


### Generation with SU2: Transonic Cylinder Flow (*Tra*)
<details>
<summary>Click to expand detailed SU2 instructions</summary>

To generate data with the fluid solver SU2, perform the following steps:
1. Follow the general [SU2 installation instructions](https://su2code.github.io/docs_v7/SU2-Linux-MacOS/) with the Python Modules. **We ran the generation on SU2 version 7.3.1 (Blackbird)! Substantially newer versions might not work.** Make sure that you have access to an MPI implementation on your system as well, since our generation script runs SU2 via the `mpiexec` command. Try running the provided [SU2 test cases](https://su2code.github.io/docs_v7/Test-Cases/) to ensure the installation was successful.
2. Ensure that the packages *numpy*, *matplotlib*, and *scipy* are installed in the python environment used for SU2.
3. Add our data generation scripts that handle the solver setup and data export to the SU2 installation by copying all files from the `data/generation_scripts/SU2` directory to a new directory in the root of your SU2 installation (for example called `SU2_raw`). These include the main generation script `data_generation.py`, the python helper file `convert_data.py` to convert the data to compressed numpy arrays, as well as the mesh file `grid_quad_2d.su2` for all simulations. Furthermore, the SU2 configuration files for the three consecutive simulations (1. steady simulation as initialization: `steady.cfg`, 2. unsteady warmup: `unsteady_2d_initial.cfg`, 3. actual simulation for data generation: `unsteady_2d_lowDissipation.cfg`) run by the python script are included.
4. Adjust paths and settings in the python generation file if necessary, and run it with the following command to generate the data (from the root directory of the SU2 installation):
```shell
python SU2_raw/data_generation.py [Thread count] [Reynolds number] [List of Mach numbers] [List of corresponding simulation folder IDs] [Restart iteration]
```
For example, to create three simulations at Mach numbers 0.6, 0.7, and 0.8 with Reynolds number 10000 using 112 threads, run the following command:
```shell
python SU2_raw/data_generation.py 112 10000 0.60,0.70,0.80 0,1,2 -1
```
5. Copy or move the generated data set directory to the `data` directory of this source code for training.
6. Post-process the data set directory structure with the `src/convert_SU2_structure.py` script (adjust the script settings if necessary), which also extracts some information from the auxiliary simulation files. Make sure that the converted data directory follows the data set structure described below.
</details>
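
As a rough illustration of the argument order used in step 4 of the SU2 instructions, the following Python sketch assembles the command line from a list of Mach numbers. The helper `build_su2_command` is hypothetical and not part of this source code; the default of consecutive folder IDs starting at 0, the two-decimal Mach number formatting, and the restart value of `-1` are simply mirrored from the example command in the instructions.

```python
import shlex


def build_su2_command(threads, reynolds, mach_numbers, sim_ids=None,
                      restart_iter=-1, script="SU2_raw/data_generation.py"):
    """Assemble the argument list for the generation script in the documented order:
    thread count, Reynolds number, Mach numbers, simulation folder IDs, restart iteration."""
    mach_numbers = list(mach_numbers)
    if sim_ids is None:
        # Assumption: number the simulation folders consecutively from 0.
        sim_ids = range(len(mach_numbers))
    sim_ids = list(sim_ids)
    if len(sim_ids) != len(mach_numbers):
        raise ValueError("need exactly one simulation folder ID per Mach number")
    return [
        "python", script,
        str(threads),
        str(reynolds),
        ",".join(f"{mach:.2f}" for mach in mach_numbers),
        ",".join(str(sim_id) for sim_id in sim_ids),
        str(restart_iter),
    ]


cmd = build_su2_command(112, 10000, [0.6, 0.7, 0.8])
print(shlex.join(cmd))
# python SU2_raw/data_generation.py 112 10000 0.60,0.70,0.80 0,1,2 -1
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the root directory of the SU2 installation.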


### Download from the Johns Hopkins Turbulence Database: Isotropic Turbulence (*Iso*)
<details>
<summary>Click to expand detailed JHTDB instructions</summary>

To extract sequences from the [Johns Hopkins Turbulence Database](http://turbulence.pha.jhu.edu/), the required steps are:
1. Install the [pyJHTDB package](https://github.com/idies/pyJHTDB) for local usage, and make sure *numpy* is available.
2. Request an [authorization token](http://turbulence.pha.jhu.edu/authtoken.aspx) to ensure access to the full database, and add it to the script `data/generation_scripts/JHTDB/get_JHTDB.py`.
3. Adjust the paths and settings in the script file if necessary, and run the script to download and convert the corresponding regions of the DNS data. Alternatively, the script `data/generation_scripts/JHTDB/get_JHTDB_scheduler.py` can be run instead. It automatically reconnects to the database if the connection is unstable or otherwise interrupted, and resumes the download.
4. Copy or move the downloaded data set directory to the `data` directory of this source code for training if necessary. Make sure to follow the data set structure described below.
</details>




### Data Set Structure
The data generation must result in the following folder structure so that the data set can be loaded correctly: `data/[datasetName]/sim_[simNr]/[field]_[timestep].npz`. Here, `datasetName` is any string, but it has to be adjusted accordingly when creating data set objects. The simulation folder numbers should be integers with a fixed width of six digits that increase continuously. Similarly, the timestep numbering should consist of integers with a fixed width of six digits that increase continuously. Start and end points for both can be configured when creating Dataset objects. Fields should be strings that describe the physical quantity, such as pressure, density, or velocity. Velocity components are typically stored in a single array, apart from the *Iso* case where the velocity z-component is stored separately as velocityZ. For example, a density snapshot at timestep zero from the *Tra* data set is referenced as `data/128_tra/sim_000000/density_000000.npz`.
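
The naming scheme can be expressed in a few lines of Python, which may be useful for checking a generated data set before training. This is a minimal sketch based only on the structure described above; the helpers `snapshot_path` and `parse_snapshot_path` are hypothetical and not part of `src/turbpred`.

```python
import re
from pathlib import Path


def snapshot_path(dataset_name, sim_nr, field, timestep, data_root="data"):
    """Build data/[datasetName]/sim_[simNr]/[field]_[timestep].npz with six-digit numbers."""
    return Path(data_root) / dataset_name / f"sim_{sim_nr:06d}" / f"{field}_{timestep:06d}.npz"


_SIM_RE = re.compile(r"^sim_(?P<sim>\d{6})$")
_FILE_RE = re.compile(r"^(?P<field>.+)_(?P<timestep>\d{6})\.npz$")


def parse_snapshot_path(path):
    """Return (simNr, field, timestep) or raise ValueError if the layout does not match."""
    path = Path(path)
    sim_match = _SIM_RE.match(path.parent.name)
    file_match = _FILE_RE.match(path.name)
    if sim_match is None or file_match is None:
        raise ValueError(f"unexpected data set layout: {path}")
    return int(sim_match["sim"]), file_match["field"], int(file_match["timestep"])


print(snapshot_path("128_tra", 0, "density", 0).as_posix())
# data/128_tra/sim_000000/density_000000.npz
```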


### General Data Post-Processing
`src/copy_data_lowres.py` can be used to downsample the simulated data sets from the generation resolution of `256x128` to the training and evaluation resolution of `128x64`. It processes all .npz data files and creates copies of all supplementary files found in the input directory. The mean and standard deviation statistics used to normalize the data to a standard normal distribution are computed via `src/compute_data_mean_std.py`.
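
To make the two post-processing steps concrete, the sketch below halves the spatial resolution by averaging 2x2 blocks and computes per-channel statistics for normalization. It is an illustration only, not the implementation of the two scripts: the interpolation used by `src/copy_data_lowres.py` may differ, and the array layout `(channels, height, width)` as well as all function names are assumptions.

```python
import numpy as np


def downsample_2x(field):
    """Halve the last two (spatial) axes by averaging non-overlapping 2x2 blocks."""
    *lead, height, width = field.shape
    if height % 2 or width % 2:
        raise ValueError("spatial dimensions must be even")
    blocks = field.reshape(*lead, height // 2, 2, width // 2, 2)
    return blocks.mean(axis=(-3, -1))


def channel_mean_std(snapshots):
    """Per-channel mean and standard deviation over a list of (C, H, W) arrays."""
    stacked = np.stack(snapshots)  # shape (N, C, H, W)
    return stacked.mean(axis=(0, 2, 3)), stacked.std(axis=(0, 2, 3))


def normalize(field, mean, std):
    """Shift and scale each channel of a (C, H, W) array to zero mean and unit std."""
    return (field - mean[:, None, None]) / std[:, None, None]


# Synthetic stand-in for one snapshot with two channels on a 256x128 grid.
snapshot = np.random.default_rng(0).normal(size=(2, 256, 128))
print(downsample_2x(snapshot).shape)  # (2, 128, 64)
```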



