# CliquePH: Higher-Order Information for Graph Neural Networks through Persistent Homology on Clique Graphs

This repository contains the code for the KDD submission "CliquePH".

## Installation
```bash
conda install python==3.10
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.7 -c pytorch -c nvidia
conda install lightning -c conda-forge  # or: pip install lightning
conda install pytorch-scatter pytorch-sparse pyg -c pyg
conda install -c dglteam/label/cu117 dgl
conda install -c conda-forge gudhi
conda install -c conda-forge graph-tool

cd repos/torch_persistent_homology/torch_persistent_homology
python setup.py install
```

## Training models
The repository implements two models, `TopoGNN` and `GCN`, both trained with
`topognn/train_model.py`. The available options depend on the selected model
and dataset; append `--help` to list them. For example, the `TopoGNN` model on
the `MNIST` dataset exposes the following configuration options:
```bash
$ python topognn/train_model.py --model TopoGNN --dataset MNIST --help
usage: train_model.py [-h] [--model {TopoGNN,GCN}]
                      [--dataset {IMDB-BINARY,REDDIT-BINARY,REDDIT-5K,PROTEINS,PROTEINS_full,ENZYMES,DD,MUTAG,MNIST,CIFAR10,PATTERN,CLUSTER,Necklaces,Cycles,NoCycles}]
                      [--training_seed TRAINING_SEED]
                      [--max_epochs MAX_EPOCHS] [--paired PAIRED]
                      [--merged MERGED] [--logger {wandb,tensorboard}]
                      [--gpu GPU] [--hidden_dim HIDDEN_DIM] [--depth DEPTH]
                      [--lr LR] [--lr_patience LR_PATIENCE] [--min_lr MIN_LR]
                      [--dropout_p DROPOUT_P] [--GIN GIN]
                      [--train_eps TRAIN_EPS] [--batch_norm BATCH_NORM]
                      [--residual RESIDUAL] [--batch_size BATCH_SIZE]
                      [--use_node_attributes USE_NODE_ATTRIBUTES]

optional arguments:
  -h, --help show this help message and exit
  --model {TopoGNN,GCN}
  --dataset {IMDB-BINARY,REDDIT-BINARY,REDDIT-5K,PROTEINS,PROTEINS_full,ENZYMES,DD,MUTAG,MNIST,CIFAR10,PATTERN,CLUSTER,Necklaces,Cycles,NoCycles}
  --training_seed TRAINING_SEED
  --max_epochs MAX_EPOCHS
  --paired PAIRED
  --merged MERGED
  --logger {wandb,tensorboard}
  --gpu GPU
  --hidden_dim HIDDEN_DIM
  --depth DEPTH
  --lr LR
  --lr_patience LR_PATIENCE
  --min_lr MIN_LR
  --dropout_p DROPOUT_P
  --GIN GIN
  --train_eps TRAIN_EPS
  --batch_norm BATCH_NORM
  --residual RESIDUAL
  --batch_size BATCH_SIZE
  --use_node_attributes USE_NODE_ATTRIBUTES
```
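
As a rough sketch of how these options combine, the helper below assembles a training command from keyword arguments, which can be handy when scripting several runs. The helper name `build_train_command` and the example values (100 epochs, learning rate 0.001, seed 42) are illustrative assumptions and not defaults from this repository; only the flag names are taken from the help output above.

```python
import shlex


def build_train_command(model, dataset, **options):
    """Assemble the argv list for topognn/train_model.py.

    Hypothetical helper: flag names follow the --help output,
    values are whatever the caller supplies.
    """
    cmd = ["python", "topognn/train_model.py", "--model", model, "--dataset", dataset]
    for flag, value in options.items():
        # Each keyword argument becomes "--flag value" on the command line.
        cmd += [f"--{flag}", str(value)]
    return cmd


# Example values are assumptions chosen for illustration only.
cmd = build_train_command("TopoGNN", "MNIST", max_epochs=100, lr=0.001, training_seed=42)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root, or the printed line can be pasted into a shell.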

### Best Configuration
The best hyperparameters used in our paper, obtained via Bayesian hyperparameter optimization, are stored in the `best_config` directory. To reproduce the results reported in the paper, run `topognn/test.py` with the hyperparameters from `best_config`.

### Logging
By default, runs are logged with TensorBoard; logging via WandB
(`--logger wandb`) is also supported, although the code may need to be adapted
to log to the correct entity/project. TensorBoard logs are stored by default
under the path `logs/{MODEL}_{DATASET}`.
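
To make the log path pattern concrete, here is a minimal sketch that resolves the directory for a given run. The function name `tensorboard_log_dir` and the `root` parameter are hypothetical and only mirror the `logs/{MODEL}_{DATASET}` pattern stated above; the actual training code may build the path differently.

```python
from pathlib import Path


def tensorboard_log_dir(model, dataset, root="logs"):
    """Return the log directory following the logs/{MODEL}_{DATASET} pattern.

    Hypothetical helper, not part of the repository.
    """
    return Path(root) / f"{model}_{dataset}"


log_dir = tensorboard_log_dir("TopoGNN", "MNIST")
print(log_dir.as_posix())
```

For this example, the logs could then be inspected with `tensorboard --logdir logs/TopoGNN_MNIST`.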

