## Diffusion-Based Neural Operator for Solving PDEs

### Environment Setup

#### Step 1: Create a Python 3.10.14 Environment

**Option A – Conda**
```bash
conda create -n physics python=3.10.14 pip -y
conda activate physics
```

**Option B – pip/venv**
```bash
python3.10 -m venv physics
source physics/bin/activate
```

---

#### Step 2: Install Core Packages

1. **Install NumPy and PyTorch (CUDA 12.1 build)**
```bash
pip install numpy==1.24.4
pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 \
  --index-url https://download.pytorch.org/whl/cu121
```

2. **Install `torch-scatter` and `torch-harmonics` (the latter with `--no-deps` to avoid dependency conflicts)**
```bash
pip install torch-scatter==2.1.2+pt22cu121 -f https://data.pyg.org/whl/torch-2.2.1+cu121.html
pip install torch-harmonics==0.6.3 --no-deps
```
 
3. **Install remaining dependencies**
```bash
pip install -r requirements.txt
```

---

#### Step 3: Install `neuraloperator` for UNO/FNO models

```bash
git clone https://github.com/christopher-beckham/neuraloperator.git
cd neuraloperator
git checkout dev_refactor
pip install -e .
cd ..
```
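
After the three steps above, a quick import check can confirm that the environment is usable. The snippet below is a minimal sketch and is not part of the repository: `check_module` is a hypothetical helper, and the import names `torch_scatter`, `torch_harmonics`, and `neuralop` are assumed to be the module names under which these packages install.

```bash
# Hypothetical helper: report whether a Python module imports
# in the currently active environment.
check_module() {
  local module="$1"
  if python3 -c "import ${module}" >/dev/null 2>&1; then
    echo "ok: ${module}"
  else
    echo "missing: ${module}"
    return 1
  fi
}

# Module names are assumptions; adjust them if an import name differs.
for module in numpy torch torch_scatter torch_harmonics neuralop; do
  check_module "${module}" || true
done
```

Any line starting with `missing:` points to the install step that should be repeated.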

---




## Experiment Training Instructions

To train a model, use the following command (adjust arguments as needed):

```bash
torchrun --standalone --nproc_per_node=<num_gpus> train.py \
  --config=<config_file.yaml> \
  --user=<username> \
  --server=<server_name> \
  --num_gpus=<num_gpus>
```

- `--config`: Path to your YAML config file (see `configs/exp_iclr/unified_guided_128/` for examples).
- `--user`: Your username (used for path configuration).
- `--server`: Server name.
- `--num_gpus`: Number of GPUs to use.

Alternatively, you can run a shell script (if available):

```bash
bash exp_scripts_v1/DiffusionPDE/darcy/test.sh <user> <server> <config_name.yaml> <num_gpus>
```

Refer to the argument descriptions under "Instructions to edit" at the end of this README for more options (e.g., `--outdir`, `--data`, `--batch`, `--mode`).
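
Because the `torchrun` template passes the GPU count twice (`--nproc_per_node` and `--num_gpus`), it can help to build the command from a single value so the two cannot drift apart. The following is a minimal sketch: `build_train_cmd` is a hypothetical helper, and the config file name, user, and server are placeholder values. It only prints the command (a dry run) rather than launching training.

```bash
# Hypothetical helper: print the training command, reusing one GPU count
# for both --nproc_per_node and --num_gpus.
build_train_cmd() {
  local num_gpus="$1" config="$2" user="$3" server="$4"
  printf 'torchrun --standalone --nproc_per_node=%s train.py --config=%s --user=%s --server=%s --num_gpus=%s\n' \
    "${num_gpus}" "${config}" "${user}" "${server}" "${num_gpus}"
}

# Placeholder values: my_config.yaml, alice, and cluster1 are illustrative only.
build_train_cmd 2 configs/exp_iclr/unified_guided_128/my_config.yaml alice cluster1
```

Inspect the printed line, then paste it into the shell to start the run.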

---

## Evaluation Instructions

To evaluate checkpoints, use the provided shell script or run the evaluation script directly. Example (from `exp_poc/eval__full.sh`):

```bash
bash exp_poc/eval__full.sh
```

Or, to run evaluation manually:

```bash
python evaluate_checkpoints.py \
  --checkpoint_dir=<path_to_checkpoint_dir> \
  --data=<path_to_data> \
  --outdir=<output_dir> \
  --test_direction=<forward|inverse> \
  --steps=<num_steps> \
  --num=<num_samples> \
  --batch=<batch_size>
```

- Adjust other options as needed (see `evaluate_checkpoints.py` for all available arguments).
- The evaluation script supports different test modes (`full`, `sparse`, `noisy`) and can save predictions, run with different sampling steps, and more.
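
To evaluate the same checkpoints in both test directions, the manual command can be wrapped in a loop. The sketch below is a dry run that prints one command per direction; `print_eval_cmds` is a hypothetical helper, all paths and numbers are placeholders, and writing each direction to its own subdirectory of the output directory is an assumption made here for tidiness, not something the evaluation script requires.

```bash
# Hypothetical helper: print one evaluation command per test direction.
print_eval_cmds() {
  local ckpt_dir="$1" data="$2" outdir="$3" steps="$4" num="$5" batch="$6"
  local direction
  for direction in forward inverse; do
    # Assumption: results for each direction go to <outdir>/<direction>.
    printf 'python evaluate_checkpoints.py --checkpoint_dir=%s --data=%s --outdir=%s/%s --test_direction=%s --steps=%s --num=%s --batch=%s\n' \
      "${ckpt_dir}" "${data}" "${outdir}" "${direction}" "${direction}" \
      "${steps}" "${num}" "${batch}"
  done
}

# Placeholder paths and values.
print_eval_cmds runs/checkpoints data/test.mat eval_out 50 100 10
```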

---




##### Instructions to edit
1. *nproc_per_node*: number of GPUs.
2. *outdir*: output directory for model checkpoints and training progress.
3. *data*: input data path: either a single *.mat* file or a directory, in which case all *.mat* files inside it are used.
4. *dataset*: name of the equation/dataset, used to set up the matching data pipeline and residuals.
5. *mode*: training mode, forward or inverse with respect to the PDE.
6. *resolution*: resolution to train on. If it differs from the native resolution of the data, the data is sliced accordingly.
7. *duration*: training duration in kimg (thousands of images seen).
8. *batch*: batch size.
9. *tick*: how often to print progress, in kimg; for example, 10 prints every 10 kimg.
10. *snap*: how often to save snapshots, in ticks.
11. *dump*: how often to dump state.
12. *batch-gpu*: batch size per GPU, should be divisible by number of GPUs.
13. *resume*: add `--resume=<path_to_.pt>` to resume training from a checkpoint.
14. *network*: path to the .pkl file to evaluate.
15. *viz_samples*: number of samples to visualize.
16. *spectral-conv*: `standard` or `tucker` spectral convolution in FNO/UNO.
17. *validate_mode*: validate during training.
18. *validate_data*: validation data path.
19. *offset*: offset to load data from.
20. *num*: number of samples to load from the data file.
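
The kimg-based options are easier to reason about with a small worked example. The sketch below assumes, as described in the list above, that 1 kimg is 1000 images, that *tick* is measured in kimg, and that *snap* is measured in ticks; the specific values are hypothetical, and the actual training loop may round or schedule snapshots slightly differently.

```bash
# Hypothetical settings.
DURATION_KIMG=200   # --duration
BATCH=64            # --batch
TICK_KIMG=10        # --tick
SNAP_TICKS=5        # --snap

# Total images seen over the whole run.
total_images=$(( DURATION_KIMG * 1000 ))
# Optimizer iterations, rounded up so a final partial batch is counted.
iterations=$(( (total_images + BATCH - 1) / BATCH ))
# A snapshot every SNAP_TICKS ticks, each tick spanning TICK_KIMG kimg.
snapshot_every_kimg=$(( TICK_KIMG * SNAP_TICKS ))
num_snapshots=$(( DURATION_KIMG / snapshot_every_kimg ))

echo "images seen:          ${total_images}"
echo "iterations:           ${iterations}"
echo "snapshot every (kimg): ${snapshot_every_kimg}"
echo "snapshots in run:      ${num_snapshots}"
```

With these values the run covers 200000 images in 3125 iterations and saves a snapshot every 50 kimg, four in total.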

