# Prompt Learning on Temporal Interaction Graphs
Code for the paper "Prompt Learning on Temporal Interaction Graphs".

# Data

Download the datasets from the [JODIE project homepage](https://snap.stanford.edu/jodie/) and pre-process them with the script provided by [TGN](https://github.com/twitter-research/tgn).

Put the pre-processed datasets in a directory named `data` under the main directory.
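
As a quick sanity check before training, the sketch below lists which pre-processed files are missing from `data`. The file names (`ml_<name>.csv`, `ml_<name>.npy`, `ml_<name>_node.npy`) are an assumption based on the output convention of TGN's preprocessing script, and `missing_files` is a hypothetical helper that is not part of this repository.

```python
from pathlib import Path
from typing import List


def missing_files(dataset: str, data_dir: str = "data") -> List[str]:
    """Return the expected pre-processed files that are not present.

    Assumption: files follow TGN's naming convention; adjust the list
    below if your pre-processing step writes different names.
    """
    root = Path(data_dir)
    expected = [
        f"ml_{dataset}.csv",       # interaction list
        f"ml_{dataset}.npy",       # edge features
        f"ml_{dataset}_node.npy",  # node features
    ]
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    # Dataset names are assumed to be the lowercase JODIE names.
    for name in ("wikipedia", "reddit", "mooc", "lastfm"):
        missing = missing_files(name)
        print(name, "ok" if not missing else f"missing: {missing}")
```

Run it from the main directory so that the relative `data` path resolves correctly.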

# How to use
## Pre-training

```
python pre_train.py --data [DATA] [--tige/--jodie/--tgn/--dyrep] --prompt_start [0.1/0.5/...] --prompt_end [0.2/0.7/...]
```
`--prompt_start` and `--prompt_end` control how the data is divided among the stages. `--prompt_start` sets the fraction of the data used for pre-training, and `--prompt_end` marks where the prompt-tuning portion ends; everything after it is left for validation and testing. The defaults are 0.5 and 0.7, which means 50% of the data for pre-training, 20% for prompt tuning, and 30% for validation and testing.
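
To make the arithmetic concrete, here is a minimal sketch of how the two fractions translate into interaction counts. It assumes the split is chronological and proportional to the number of interactions; the actual scripts may split on timestamps instead, and `split_counts` is a hypothetical helper used only for illustration.

```python
def split_counts(num_interactions, prompt_start=0.5, prompt_end=0.7):
    """Return (pre-training, prompt-tuning, validation+test) counts.

    Assumption: interactions are sorted by time and cut at the two
    fractions; this mirrors the description above, not the repo code.
    """
    if not 0.0 < prompt_start < prompt_end < 1.0:
        raise ValueError("need 0 < prompt_start < prompt_end < 1")
    n_pretrain = round(num_interactions * prompt_start)
    n_prompt = round(num_interactions * prompt_end) - n_pretrain
    n_eval = num_interactions - n_pretrain - n_prompt
    return n_pretrain, n_prompt, n_eval


if __name__ == "__main__":
    # Default setting on 1000 interactions: 50% / 20% / 30%.
    print(split_counts(1000))  # (500, 200, 300)
```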

For the mooc and lastfm datasets, pass one more argument: `--dim 100`.
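
If you launch several runs from a script, the rule above is easy to forget. The sketch below assembles the pre-training command from the flags listed in this README and appends `--dim 100` for mooc and lastfm. `pre_train_command` is a hypothetical helper, and the lowercase dataset names passed to `--data` are an assumption.

```python
import sys

BACKBONES = {"tige", "jodie", "tgn", "dyrep"}
DIM_100_DATASETS = {"mooc", "lastfm"}  # datasets that need --dim 100


def pre_train_command(data, backbone="tige", prompt_start=0.5, prompt_end=0.7):
    """Build the argument list for pre_train.py (illustrative only)."""
    if backbone not in BACKBONES:
        raise ValueError(f"unknown backbone: {backbone}")
    cmd = [
        sys.executable, "pre_train.py",
        "--data", data,
        f"--{backbone}",
        "--prompt_start", str(prompt_start),
        "--prompt_end", str(prompt_end),
    ]
    if data in DIM_100_DATASETS:
        cmd += ["--dim", "100"]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it.
    print(" ".join(pre_train_command("mooc", backbone="tgn")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the main directory.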

If you choose to use TIGER as the backbone model, please run:

```
python pre_train.py --msg_src [left/right] --upd_src [left/right] --restarter [seq/static] --restart_prob [0/0.001/...]
```

The pre-training results on the link prediction task can be used as baselines.

## “Pre-train, Prompt” Paradigm

### Link Prediction

```
python prompt_tune_link_prediction.py --code [CODE] --prompter_type [vanilla/transformer/projection] --prompt_end [0.2/0.7/...] 
```

Here, `[CODE]` is the hash code of a model pre-trained with `pre_train.py`. For prompt tuning, `--prompt_start` is taken from the pre-trained model, so you do not need to set it here.

`--prompt_end` must be greater than the `--prompt_start` you set for pre-training. It may be smaller than the `--prompt_end` set for pre-training, in which case only a small part of the data is used for prompt tuning.
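
The constraint can be checked before launching a run. The sketch below validates a candidate `--prompt_end` against the `--prompt_start` used for pre-training and reports the fraction of data that falls in between. `prompt_tuning_share` is a hypothetical helper, and it assumes the prompt-tuning portion is exactly the span between the two values.

```python
def prompt_tuning_share(prompt_end, pretrain_prompt_start):
    """Fraction of the data available for prompt tuning.

    Raises ValueError when prompt_end does not lie strictly between the
    pre-training prompt_start and 1.
    """
    if not pretrain_prompt_start < prompt_end < 1.0:
        raise ValueError(
            "prompt_end must be greater than the pre-training prompt_start "
            "and less than 1"
        )
    return prompt_end - pretrain_prompt_start


if __name__ == "__main__":
    # Pre-trained with --prompt_start 0.5, prompt-tuned with --prompt_end 0.6:
    # roughly 10% of the data is used for prompt tuning.
    print(round(prompt_tuning_share(0.6, 0.5), 4))
```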

### Node Classification

```
python prompt_tune_node_classification.py --code [CODE] --use_valid --prompt_end [0.2/0.7/...]
```
Here, `[CODE]` is the hash code of a model pre-trained with `pre_train.py`.

We suggest passing `--use_valid`, which saves the model and TProG that perform best during validation.

`--prompt_end` must be greater than the `--prompt_start` you set for pre-training. It may be smaller than the `--prompt_end` set for pre-training.

## Extension: “Pre-train, Prompt-based Fine-tune” Paradigm

### Link Prediction

```
python prompt_tune_link_prediction.py --code [CODE] --prompter_type [vanilla/transformer/projection] --prompt_end [0.2/0.7/...] --fine_tune_mode
```

Here, `[CODE]` is the hash code of a model pre-trained with `pre_train.py`. For fine-tuning, `--prompt_start` is taken from the pre-trained model, so you do not need to set it here.
Adding `--fine_tune_mode` optimizes the original model together with the TProG.


### Node Classification

```
python fine_tune_node_classification.py --code [CODE] --pretrained_prompter/--optimizing_prompter --use_valid --prompt_end [0.2/0.7/...]
```
Here, `[CODE]` is the hash code of a model trained with `prompt_tune_link_prediction.py`.

To generate the prompts directly with a trained TProG (one trained during prompt tuning or fine-tuning on the link prediction task), pass `--pretrained_prompter`.

To optimize the TProG during this process, pass `--optimizing_prompter`. This initializes a new TProG from the trained one and optimizes it during node classification fine-tuning.

## Node Classification Baseline

```
python train_supervised.py --code [CODE] --prompt_end [0.2/0.7/...]
```
Here, `[CODE]` is the hash code of a model trained with `pre_train.py`.
This script is only used to record baseline results.

