# Image Reconstruction Dataset and Usage

We will make the full dataset available. Because of its size, it is difficult to share anonymously, so for now we provide an excerpt containing roughly 460k rows: download the folder "dataset excerpt" from https://osf.io/6jscn/?view_only=55141425dccf4a039e0f284480c8b0cc.

## Usage to Train a CNN Model

We demonstrate how to train the CNN architecture described in:

**Robust and Interpretable Blind Image Denoising Via Bias-Free Convolutional Neural Networks**  
Sreyas Mohan, Zahra Kadkhodaie, Eero P. Simoncelli, Carlos Fernandez-Granda  
ICLR 2020  
https://arxiv.org/abs/1906.05478

1. Download the code from https://github.com/LabForComputationalVision/bias_free_denoising/tree/64930f028aec88b0ebc1fc13f79932f5b7aa744d
2. Apply the patch `cnn.patch` from this repository to the downloaded code.
3. Add the folder `reconstruction_dataset` from this repository to the downloaded code.
4. Run `python train.py --in_channels=3 --data-path=<path-to-dataset>`, where `<path-to-dataset>` points to the local copy of the dataset.

One epoch corresponds to 500 batches. This example does not separate training and validation data, so the validation metrics reported during training are not reliable.
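
One way to obtain meaningful validation metrics is to hold out a fixed subset of rows before training. The sketch below is not part of the patched code: `split_indices`, the 5% validation fraction, and the seed are illustrative assumptions, and the resulting index lists would still have to be wired into the data loader.

```python
import random


def split_indices(num_rows, val_fraction=0.05, seed=0):
    """Return (train_indices, val_indices) as two disjoint, sorted lists.

    Hypothetical helper; not part of the patched training code.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    indices = list(range(num_rows))
    # A fixed seed keeps the split reproducible across runs.
    random.Random(seed).shuffle(indices)
    # Hold out at least one row, even for very small datasets.
    num_val = max(1, int(num_rows * val_fraction))
    val_indices = sorted(indices[:num_val])
    train_indices = sorted(indices[num_val:])
    return train_indices, val_indices


if __name__ == "__main__":
    # 460_000 is roughly the number of rows in the dataset excerpt.
    train_idx, val_idx = split_indices(460_000)
    print(len(train_idx), len(val_idx))
```

Splitting by row index assumes that rows are independent; if several rows derive from the same source image, splitting by source would be the safer choice.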

A pre-trained network is available at https://osf.io/9axsp?view_only=55141425dccf4a039e0f284480c8b0cc under the name `trained_BF-cnn.pt`. The network was trained on about 5 million images (about 40k batches of size 128). Training for a further 50k batches did not yield additional improvement.
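
The training budget quoted above can be checked with a few lines of arithmetic. The function name `training_budget` is illustrative; the batch size, the batches per epoch, and the approximate batch count are taken from the text.

```python
BATCH_SIZE = 128         # batch size stated in this README
BATCHES_PER_EPOCH = 500  # one epoch corresponds to 500 batches
NUM_BATCHES = 40_000     # "about 40k batches"


def training_budget(num_batches, batch_size=BATCH_SIZE,
                    batches_per_epoch=BATCHES_PER_EPOCH):
    """Return (images_seen, epochs) for a given number of batches."""
    images_seen = num_batches * batch_size
    epochs = num_batches / batches_per_epoch
    return images_seen, epochs


if __name__ == "__main__":
    images, epochs = training_budget(NUM_BATCHES)
    print(f"{images:,} images over {epochs:.0f} epochs")
```

With these numbers, 40k batches amount to 5.12 million images, or 80 epochs of 500 batches each.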

## Usage to Train a Diffusion Model

We demonstrate how to train the diffusion model described in:

**Elucidating the Design Space of Diffusion-Based Generative Models**  
Tero Karras, Miika Aittala, Timo Aila, Samuli Laine  
NeurIPS 2022  
https://arxiv.org/abs/2206.00364

1. Download the code from https://github.com/NVlabs/edm/tree/008a4e5316c8e3bfe61a62f874bddba254295afb
2. Apply the patch `edm.patch` from this repository to the downloaded code. (Note that the patch breaks some other functionality of the code in order to enable training on the image reconstruction dataset.)
3. Add the folder `reconstruction_dataset` from this repository to the downloaded code.
4. To train, for example, on 2 GPUs, run `torchrun --standalone --nproc_per_node=2 train.py --lr=0.001 --outdir=training-runs --data=datasets/cifar10-32x32.zip --cond=0 --arch=ddpmpp --batch-gpu=64 --fp16=1 --dump=250 --use_paired_data=1 --paired_data_path=<path-to-dataset> --in_memory_data_prep=none --num_proc_data_prep=12 --noise_proportion=0.1`, where `<path-to-dataset>` points to the directory containing a local copy of the dataset.
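
Because the training command in step 4 is long, it can be convenient to assemble it programmatically when changing the number of GPUs or the dataset location. The following is a minimal sketch: the helper `build_edm_train_command` is hypothetical and not part of either repository, and every flag and value is copied verbatim from the command above.

```python
import shlex


def build_edm_train_command(dataset_path, num_gpus=2, outdir="training-runs"):
    """Assemble the torchrun call from step 4 as an argument list.

    Hypothetical helper; all flags are copied from the README command.
    """
    return [
        "torchrun", "--standalone", f"--nproc_per_node={num_gpus}",
        "train.py",
        "--lr=0.001",
        f"--outdir={outdir}",
        "--data=datasets/cifar10-32x32.zip",
        "--cond=0",
        "--arch=ddpmpp",
        "--batch-gpu=64",
        "--fp16=1",
        "--dump=250",
        "--use_paired_data=1",
        f"--paired_data_path={dataset_path}",
        "--in_memory_data_prep=none",
        "--num_proc_data_prep=12",
        "--noise_proportion=0.1",
    ]


if __name__ == "__main__":
    # "/path/to/dataset" is a placeholder for the local dataset directory.
    cmd = build_edm_train_command("/path/to/dataset", num_gpus=2)
    # Print a shell-safe version of the command for inspection.
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the root of the patched code to launch training.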

A pre-trained network is available at https://osf.io/9axsp?view_only=55141425dccf4a039e0f284480c8b0cc under the name `network-snapshot-edm-072897.pkl`.
