# GOAT
**G**raph **O**ptimization via **A**ugmented **T**ransformations (GOAT)


## Abstract

To address out-of-distribution shifts in graph-structured data at test time, we propose a novel self-supervised framework called **GOAT** (**G**raph **O**ptimization via **A**ugmented **T**ransformations). By focusing on test-time adaptation and introducing an adapter that operates **solely on node features**, **GOAT** efficiently adapts to distribution shifts without modifying the pre-trained model's parameters, designing specialized pre-training methods, or accessing any information from the data used during pre-training.

![GOAT_structure.png](GOAT_structure.png)


## Requirements
We used Python 3.10.8. For the Python packages, please see [requirements.txt]().

```
deeprobust==0.2.7
dgl==1.1.2+cu118
gensim 
googledrivedownloader==0.4
ipdb==0.13.7
matplotlib==3.8.3
networkx==2.8.8
node2vec==0.4.6
normflows==1.7.3
numba==0.59.0
numpy==1.23.5
ogb==1.3.6
pandas==2.2.1
prompt-toolkit==3.0.43
protobuf==3.20.0
PyGCL==0.1.2
scipy==1.11.4
seaborn==0.13.2
threadpoolctl==3.1.0
torch==2.0.0
torch_geometric==2.0.1
torch-cluster==1.6.3+pt20cu118
torch-scatter==2.1.2+pt20cu118
torch-sparse==0.6.18+pt20cu118
torch-spline-conv==1.2.2+pt20cu118
tqdm
visualization==1.0.0
```

## Download Datasets
We used the datasets provided by Wu et al. We slightly modified their code to support data loading and put it in the `GraphOOD-EERM` folder.

Create the directory `./GraphOOD-EERM/data` and download all the datasets into it.

Make sure the data files are in the `./GraphOOD-EERM/data` folder:
```
project
│   README.md
│   train_both_all.py
│   script.sh
│   ...
│
└───GraphOOD-EERM
│   └───data
│       │   Amazon
│       │   elliptic
│       │   ...
│   
└───robustness
```
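
As a quick sanity check of the layout above, the following minimal sketch reports which dataset folders are missing. The helper name `missing_datasets` is hypothetical and not part of this repository, and only the two dataset names shown in the tree are checked; extend the list to match the datasets you downloaded.

```python
from pathlib import Path

# Directory taken from the layout shown above.
DATA_ROOT = Path("GraphOOD-EERM") / "data"


def missing_datasets(data_root, expected):
    """Return the names in `expected` that have no entry under `data_root`."""
    root = Path(data_root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    # Only the folders named in the tree above are listed here (illustrative).
    missing = missing_datasets(DATA_ROOT, ["Amazon", "elliptic"])
    if missing:
        print("Missing under", DATA_ROOT, ":", ", ".join(missing))
    else:
        print("All listed datasets found.")
```

Run it from the project root so that the relative path resolves correctly.
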
## Note
We note that, according to its open-source code, the GCN used in the EERM experiments does not normalize the adjacency matrix. Here we normalize the adjacency matrix to be consistent with the original GCN.
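
For reference, the original GCN uses the symmetric normalization D^{-1/2} (A + I) D^{-1/2}, where D is the degree matrix of A + I. The sketch below illustrates that formula on a small dense matrix in plain Python. It is not the code used in this repository, which presumably relies on sparse tensor operations, and the function name `gcn_normalize` is an assumption made for illustration.

```python
import math


def gcn_normalize(adj):
    """Symmetrically normalize a dense adjacency matrix (list of lists).

    Computes D^{-1/2} (A + I) D^{-1/2}, where D is the degree matrix of A + I.
    Assumes non-negative edge weights, so every degree is at least 1.
    """
    n = len(adj)
    # Add self-loops: A_tilde = A + I.
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    # Degree of each node in A_tilde (row sums).
    deg = [sum(row) for row in a_tilde]
    inv_sqrt = [1.0 / math.sqrt(d) for d in deg]
    # Scale entry (i, j) by 1 / sqrt(deg_i * deg_j).
    return [[inv_sqrt[i] * a_tilde[i][j] * inv_sqrt[j] for j in range(n)]
            for i in range(n)]


# Path graph 0 - 1 - 2 (illustrative input).
adj = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
norm = gcn_normalize(adj)
```

With self-loops added, the degrees of this path graph are 2, 3, and 2, so the entry for edge (0, 1) becomes 1/sqrt(2 * 3) and the result stays symmetric.
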

## Run our code
Simply run the following command to get started.
```
python main.py --gpu_id=0 --dataset=cora --model=GCN  --seed=0 --tune=0 --mlp_prompt=True --prompt=True --LR=True
python main.py --gpu_id=0 --dataset=ogb-arxiv --model=GPR  --seed=0 --tune=0 --mlp_prompt=True --prompt=True --LR=True
python main.py --gpu_id=0 --dataset=elliptic --model=SAGE  --seed=0 --tune=0 --mlp_prompt=True --prompt=True --LR=True
python main.py --gpu_id=0 --dataset=cora --model=SAGE  --seed=0 --tune=0 --mlp_prompt=True --prompt=True --LR=True
python main.py --gpu_id=0 --dataset=ogb-arxiv --model=SAGE  --seed=0 --tune=0 --mlp_prompt=True --prompt=True --LR=True
python main.py --gpu_id=0 --dataset=cora --model=GAT  --seed=0 --tune=0 --mlp_prompt=True --prompt=True --LR=True
```
Here `--tune=0` means that the fixed hyper-parameters we provide are used.

You can also run the following script.
```
mkdir saved
bash script.sh 
```
You can also try different losses for the test-time adapter LROG:
```
python main.py --gpu_id=0 --dataset=cora --model=GCN  --tune=0 --seed=0 --debug=1 --loss=["La2a", "Ls", "Lc", "Lr"]
python main.py --gpu_id=0 --dataset=cora --model=GCN  --tune=0 --seed=0 --debug=1 --loss="entropy"
python main.py --gpu_id=0 --dataset=cora --model=GCN  --tune=0 --seed=0 --debug=1 --loss="gtrans"
```
Note that `["La2a", "Ls", "Lc", "Lr"]` is the designed point-estimation optimization target over two sampled views.
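
To give a feel for the `--loss="entropy"` option, here is a minimal sketch that assumes it refers to the mean Shannon entropy of the model's softmax predictions, a common test-time adaptation objective. This README does not spell out the exact definition, so treat the function names `softmax` and `mean_entropy` and the formula itself as assumptions rather than the repository's implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one node's class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_entropy(batch_logits):
    """Average Shannon entropy (in nats) of softmax predictions over nodes."""
    total = 0.0
    for logits in batch_logits:
        probs = softmax(logits)
        # Skip zero probabilities, whose contribution to the entropy is 0.
        total += -sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(batch_logits)


# Illustrative logits: one confident node and one maximally uncertain node.
confident = [[10.0, 0.0, 0.0]]
uncertain = [[0.0, 0.0, 0.0]]
print(mean_entropy(confident) < mean_entropy(uncertain))
```

Minimizing such a loss pushes predictions on the test graph toward confident class assignments without requiring labels.
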

## Hyper-parameter tuning suggestion
Test-time graph transformation requires careful tuning. A general suggestion is to use a small learning rate and a small number of training epochs. If you are using GOAT on other datasets, please first tune the hyper-parameters on the validation set, i.e., `--test_val=1 --tune=1`.
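
The sketch below builds the corresponding command lines for several dataset and model pairs. It uses only flags that appear in this README; the helper name `tuning_command` is hypothetical, and whether additional flags such as `--prompt=True` are also required during tuning is not stated here, so adjust as needed.

```python
import shlex


def tuning_command(dataset, model, gpu_id=0, seed=0):
    """Build a main.py invocation that tunes on the validation set."""
    args = [
        "python", "main.py",
        f"--gpu_id={gpu_id}",
        f"--dataset={dataset}",
        f"--model={model}",
        f"--seed={seed}",
        "--test_val=1",  # evaluate on the validation set
        "--tune=1",      # enable hyper-parameter tuning
    ]
    # shlex.join quotes arguments where the shell would need it.
    return shlex.join(args)


# Dataset and model names taken from the example commands in this README.
for dataset, model in [("cora", "GCN"), ("elliptic", "SAGE")]:
    print(tuning_command(dataset, model))
```
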
