# Graph convolutional DKMs

## Prerequisites

This repository implements graph convolutional deep kernel machines (DKMs). To run the code, install the following packages:

```
pytorch
numpy
pandas
matplotlib
torch_geometric
```

To recreate the environment used for running the experiments, install from `requirements.txt` (the Python version used was 3.11.9).
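
To confirm that the prerequisites are visible to your interpreter before training, a quick check like the one below can help. This is a minimal sketch, not part of the repository: the helper `missing_modules` is hypothetical, and the list assumes the usual import names for the packages above (PyTorch is imported as `torch`).

```python
# Check that the prerequisite packages are importable in the current environment.
from importlib.util import find_spec

# Import names of the packages listed above (assumption: PyTorch imports as `torch`).
REQUIRED_MODULES = ["torch", "numpy", "pandas", "matplotlib", "torch_geometric"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All prerequisite packages are importable.")
```

The check only looks the modules up on the import path; it does not import them or verify their versions.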

## Examples

To run a simple example on cora, try `make cora-example`.

To train models on other datasets or with different parameters, run the training script directly. For example:

```bash
## train a DKM with dof=1. and 100 inducing points on squirrel
python ./train_gcdkm.py --dataset squirrel --dof 1. --Pi 100 --split test-0-0

## train a sparse NNGP on citeseer with 100 inducing points
python ./train_gcdkm.py --dataset citeseer --dof inf --Pi 100 --split public
```
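
If you want to launch several such runs in a row, the commands above can be generated from a short script. The following is only a sketch under stated assumptions: the helpers `build_command` and `sweep` are hypothetical, the flags are the ones shown above, and the particular combination of values is illustrative rather than a configuration from the paper. By default it prints the commands without running them.

```python
# Build (and optionally run) train_gcdkm.py commands for a small sweep.
import subprocess
import sys


def build_command(dataset, dof, num_inducing, split):
    """Assemble the argument list for one training run."""
    return [
        sys.executable, "./train_gcdkm.py",
        "--dataset", dataset,
        "--dof", str(dof),
        "--Pi", str(num_inducing),
        "--split", split,
    ]


def sweep(dataset, dofs, num_inducing, split, dry_run=True):
    """Print each command; also run it when dry_run is False."""
    commands = [build_command(dataset, dof, num_inducing, split) for dof in dofs]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing run
    return commands


# Illustrative sweep (an assumption): both dof values from the examples, on citeseer.
commands = sweep("citeseer", dofs=["1.", "inf"], num_inducing=100, split="public")
```

Pass `dry_run=False` to actually start training; `check=True` aborts the sweep as soon as one run exits with an error.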

You should be able to train simple models on all the datasets with an NVIDIA RTX 3090. For our experiments (which involve more inducing points and different architectures) we used an A100 for the two largest datasets (arxiv and reddit).

## Recreating artefacts from the paper

To recreate the artefacts (plots and tables) from the paper, use the `Makefile`:

```bash
make final-table # prints the table as LaTeX

# to recreate plots
make val-acc-dof-plot
make shaped
make linear-gcdkm ## quicker with a gpu
```

## Rerunning experiments entirely

If you want to rerun the experiments from scratch, delete all the pickle files (`rm -f **/*.pkl`; in bash this only recurses into subdirectories if `globstar` is enabled via `shopt -s globstar`) and run:

```bash
bash scripts/run_gcn.sh ## for GCN results
bash scripts/run_dkm.sh ## for DKM results
```

This will take a fairly long time! Additionally, as mentioned above, you will need an A100 for reddit and arxiv.
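
The clean-up step at the start of this section can also be done without relying on shell globbing. A minimal sketch, assuming the cached results are the `.pkl` files under the directory you pass in; the helper `remove_pickles` is hypothetical and not part of the repository. It defaults to a dry run so you can inspect what would be deleted first.

```python
# Recursively list (and optionally delete) cached pickle files.
from pathlib import Path


def remove_pickles(root=".", dry_run=True):
    """Find *.pkl files under `root`; delete them unless dry_run is True."""
    paths = sorted(p for p in Path(root).rglob("*.pkl") if p.is_file())
    for path in paths:
        print(("would remove " if dry_run else "removing ") + str(path))
        if not dry_run:
            path.unlink()
    return paths
```

Call `remove_pickles()` from the repository root to preview the files, then `remove_pickles(dry_run=False)` to delete them before running the scripts.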