# Curve Your Attention: Mixed-Curvature Transformers for Graph Representation Learning

This repository contains our implementation of the Fully Product-Stereographic Transformer (FPS-T), based on the [HGCN](https://github.com/HazyResearch/hgcn) and [QGCN](https://github.com/xiongbo010/QGCN) repositories.
The `encoders/` folder contains the implementation of FPS-T and TokenGT.

## Training and Evaluation
```
python train.py
Arguments:
  --act ACT             which activation function to use (or None for no
                        activation)
  --bias BIAS           whether to use bias (1) or not (0)
  --c C                 initial curvature, set to None for trainable curvature
  --cuda CUDA           which cuda device to use (-1 for cpu training)
  --curvature-lr C_LR   learning rate for curvature parameter
  --dataset DATASET     which dataset to use
  --dim DIM             embedding dimension
  --dropout DROPOUT     dropout probability
  --epochs EPOCHS       maximum number of epochs to train for
  --head-dim H_DIM      dimension within attention head
  --hops HOPS           number of hops for input feature mixing
  --lap-dropout L_D     dropout on laplacian eigenvectors
  --lap-k L_K           number of eigenvectors to use for positional encoding
  --lap-sign-flip L_F   whether to use sign flips on laplacian eigenvectors
  --log-freq LOG_FREQ   how often to log results
  --lr LR               learning rate
  --model MODEL         which encoder to use [TokenGT, FPST]
  --normalize-feats NORMALIZE_FEATS
                        whether to normalize input node features
  --num-heads N_HEADS   number of attention heads for graph attention
                        networks, must be a divisor of dim
  --num-layers NUM_LAYERS
                        number of hidden layers in encoder
  --seed SEED           seed for training
  --task TASK           which tasks to train on, can be any of [md, nc]
  --weight-decay WEIGHT_DECAY
                        l2 regularization strength
  --patience PATIENCE   patience for early stopping
```
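
The `--lap-k` and `--lap-sign-flip` options refer to Laplacian-eigenvector positional encodings. The sketch below illustrates the general idea under stated assumptions: it uses the symmetric normalized Laplacian, a dense NumPy eigendecomposition (suitable for small graphs only), and zero-padding when the graph has fewer than `k` nodes. The function name `laplacian_positional_encoding` and its parameters are hypothetical and are not taken from this repository, whose preprocessing may differ.

```python
import numpy as np


def laplacian_positional_encoding(adj, k, sign_flip=False, rng=None):
    """Illustrative only: k lowest-frequency eigenvectors of the normalized Laplacian.

    adj       -- dense symmetric adjacency matrix, shape (n, n)
    k         -- number of eigenvectors to keep (cf. --lap-k)
    sign_flip -- randomly flip the sign of each eigenvector (cf. --lap-sign-flip)
    rng       -- optional numpy.random.Generator for reproducible flips
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=1)

    # D^{-1/2}; isolated nodes get 0 to avoid division by zero.
    inv_sqrt = np.zeros(n)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}.
    lap = np.eye(n) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]

    # eigh returns eigenvalues in ascending order for symmetric input.
    _, vecs = np.linalg.eigh(lap)
    pe = vecs[:, :k]

    # Assumption: pad with zero columns when the graph has fewer than k nodes.
    if pe.shape[1] < k:
        pe = np.pad(pe, ((0, 0), (0, k - pe.shape[1])))

    if sign_flip:
        # Eigenvectors are defined only up to sign, so random flips act as augmentation.
        rng = np.random.default_rng() if rng is None else rng
        signs = rng.choice([-1.0, 1.0], size=pe.shape[1])
        pe = pe * signs[None, :]
    return pe


# Usage: a 4-node cycle graph with 2 positional-encoding dimensions per node.
cycle = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]])
pe = laplacian_positional_encoding(cycle, k=2)
print(pe.shape)  # (4, 2)
```

Because each eigenvector is only determined up to sign, flipping signs at random during training is a common way to keep a model from depending on an arbitrary choice made by the eigensolver.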

## Examples

Running FPS-T for graph reconstruction on the Web-Edu network:
```
python train.py --act=ReLU --attn-dropout=0 --bias=1 --c=None --cuda=0 --curvature-lr=0.01 --dataset=web-edu --dim=16 --dropout=0 --epochs=10000 --head-dim=16 --hops=0 --lap-dropout=0 --lap-k=32 --lap-sign-flip=False --layer-norm=0 --log-freq=5 --lr=0.01 --model=FPST --normalize-feats=0 --num-heads=2 --num-layers=1 --patience=10000 --seed=1 --task=md --weight-decay=0
```

Running FPS-T for node classification on the Actor network:
```
python train.py --act=Sigmoid --bias=1 --c=None --cuda=0 --curvature-lr=0.0001 --dataset=actor --dim=16 --dropout=0.5 --epochs=1000 --head-dim=16 --hops=0 --lap-dropout=0 --lap-k=32 --lap-sign-flip=False --log-freq=5 --lr=0.01 --model=FPST --normalize-feats=0 --num-heads=1 --num-layers=3 --seed=0 --task=nc --weight-decay=0 --patience=1000
```
