# SwinGNN: Rethinking Permutation Invariance in Diffusion Models for Graph Generation
Code for TMLR submission

This repository contains the official implementation of the SwinGNN model in PyTorch.


## Sampling processes of learned models on grid and protein datasets
![](asset/animation_grid_slow_v1.gif)

![](asset/animation_dd_protein_slow_v1.gif)

## Get started
### Install dependencies
```bash
# option 1: Python 3.8 venv (use either option 1 or option 2, not both)
python -m venv venvscorenet
source venvscorenet/bin/activate
pip install -U pip
pip install cython==0.29.32 
pip install pomegranate 
pip install -r setup/requirements.txt

# option 2: conda
conda create -n scorenet python=3.8 
conda activate scorenet 
pip install -U pip 
pip install cython==0.29.32 
pip install pomegranate 
pip install -r setup/requirements.txt

# compile ORCA for orbit statistics evaluation (run from the repository root)
export PROJ_DIR=$(pwd)
cd evaluation/orca && g++ -O2 -std=c++11 -o orca orca.cpp && cd "$PROJ_DIR"
```
Run `pip install cython==0.29.32` and `pip install pomegranate` before installing the remaining requirements. Installing in this order helps avoid errors when installing the `molsets` (`moses`) package, which is required for calculating molecule metrics. The original `molsets` (`moses`) package at https://github.com/molecularsets/moses is no longer actively maintained.
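Before running evaluation, it can help to confirm that the ORCA compile step above actually produced an executable. The following is a minimal sketch: `check_orca_binary` is a hypothetical helper, not part of this repository, and the binary path is inferred from the `-o orca` flag in the compile command.

```bash
# Hypothetical helper (not part of the repository): report whether
# the ORCA binary exists as a regular file and is executable.
check_orca_binary() {
    orca_path="$1"
    if [ -f "$orca_path" ] && [ -x "$orca_path" ]; then
        echo "ORCA binary found: $orca_path"
        return 0
    fi
    echo "ORCA binary missing or not executable: $orca_path" >&2
    return 1
}

# Path assumed from the compile command above (-o orca inside evaluation/orca).
# Falls back to the current directory if PROJ_DIR is not set.
if ! check_orca_binary "${PROJ_DIR:-$(pwd)}/evaluation/orca/orca"; then
    echo "Re-run the g++ compile step before evaluating orbit statistics." >&2
fi
```

If the check fails, the most likely causes are a missing `g++` or running the compile command from outside the repository root.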

### Set up datasets
```bash
# prepare datasets
python setup/gen_graph_data.py  # prepare various synthetic and real-world graph datasets
python setup/mol_preprocess.py --dataset ZINC250k  # prepare ZINC250k dataset
python setup/mol_preprocess.py --dataset QM9  # prepare QM9 dataset
```


## Training command
Below we provide the training commands for SwinGNN on graph datasets and molecule datasets.
Please refer to `config/edm_swin_gnn` for more training configurations.
```bash
# training commands for graph datasets (without node/edge attributes), e.g., to train on the grid dataset
python train.py -c config/edm_swin_gnn/grid_edm_swin_gnn_80.yaml --batch_size 10 -m=grid

# DDP (multi-GPU) training is also supported
export NUM_GPUS=4
torchrun --nproc_per_node=$NUM_GPUS train.py -c config/edm_swin_gnn/grid_edm_swin_gnn_80.yaml --batch_size 40 --ddp -m=grid_ddp

# training commands for molecule datasets (with node/edge attributes), e.g., to train on the QM9 dataset
torchrun --nproc_per_node=$NUM_GPUS train.py -c config/edm_swin_gnn/qm9_edm_swin_gnn.yaml --feature_dims 60 --node_encoding one_hot --edge_encoding one_hot --batch_size 10240 --ddp -m qm9
```
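The grid examples above use `--batch_size 10` on a single process and `--batch_size 40` with 4 GPUs. This is consistent with `--batch_size` being the total across all DDP processes, but that reading is an assumption, so confirm it in `train.py` before relying on it. Under that assumption, the value for a different GPU count could be derived as in the sketch below, where `PER_GPU_BATCH` and `GLOBAL_BATCH` are illustrative variable names, not options of the training script.

```bash
NUM_GPUS=4          # same value as in the DDP example above
PER_GPU_BATCH=10    # assumed per-GPU batch size (matches the single-process grid example)

# Assumption: --batch_size is the total batch size across all DDP processes.
GLOBAL_BATCH=$((NUM_GPUS * PER_GPU_BATCH))

# Print the resulting command instead of launching training (dry run).
echo "torchrun --nproc_per_node=$NUM_GPUS train.py -c config/edm_swin_gnn/grid_edm_swin_gnn_80.yaml --batch_size $GLOBAL_BATCH --ddp -m=grid_ddp"
```

With the values shown, the printed command matches the 4-GPU grid example above.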
