## Data
The data can be downloaded from an anonymized [Google Drive link](https://drive.google.com/file/d/10ILD5pB1vuP7kW-Tvnc0Xm-uddpZJO-s/view?usp=sharing).
The graphs come from the [SNAP](https://snap.stanford.edu/data/) repository. Each graph is stored as a weighted edge list (with weighted cascade weights) in a `.inf` file and is accompanied by an attribute file, so that the open-source code for [IMM](https://github.com/snowgy/Influence_Maximization/wiki/Home/) can be used to reproduce the benchmark along with the results.
Unzip `data.zip` into a "data" folder inside the current folder.
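
For readers unfamiliar with the weighted cascade model, the sketch below shows how such weights are commonly derived: every edge (u, v) gets weight 1 / in-degree(v). The function name `weighted_cascade` is hypothetical, the input is assumed to be a list of distinct directed edges, and the exact column layout of the `.inf` files is not reproduced here.

```python
from collections import Counter


def weighted_cascade(edges):
    """Return (u, v, w) triples with w = 1 / in-degree(v).

    Assumes `edges` holds distinct directed (u, v) pairs.
    """
    edges = list(edges)
    # Count how many edges point into each target node.
    in_degree = Counter(v for _, v in edges)
    return [(u, v, 1.0 / in_degree[v]) for u, v in edges]


# Toy graph: node 2 has two incoming edges, node 3 has one.
toy_edges = [(0, 2), (1, 2), (2, 3)]
for u, v, w in weighted_cascade(toy_edges):
    print(u, v, w)
```

With these weights, the incoming weights of every node that has at least one in-neighbor sum to 1.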


## Requirements
To run this code you will need Python 3.5.2 with the following packages:
* [pytorch 1.5.1](https://pytorch.org/)
* [networkx 1.11](https://networkx.github.io/) 
* [sklearn](https://scikit-learn.org/stable/) 
* [numpy](https://www.numpy.org/)
* [pandas](https://pandas.pydata.org/)
* [scipy](https://www.scipy.org/)
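
Before running the scripts, it may help to confirm that the packages above are importable. The snippet below is a minimal sketch and is not part of the repository; the helper `installed_versions` is hypothetical, and the import names (`torch` for pytorch, `sklearn` for scikit-learn) are assumptions about how the packages are installed.

```python
import importlib
import sys

# Assumed import names for the packages listed above.
REQUIRED = ["torch", "networkx", "sklearn", "numpy", "pandas", "scipy"]


def installed_versions(modules):
    """Map each module name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in modules:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Not every module exposes __version__, so fall back to a marker.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, version in installed_versions(REQUIRED).items():
        print(name, version or "MISSING")
```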


## Code
The following scripts use the default parameters mentioned in the paper.

1. Influence estimation using the stored model.
```bash
python influence_predictions.py
```

2. Influence maximization (20 and 100 seeds) using the stored model with Celf-glie, followed by evaluation of the seeds. Note that evaluation can take more than 3 hours for the large datasets.

```bash
python celf_glie.py
```

3. Influence maximization using the stored GNN model and the stored Grim model.

```bash
python grim.py
```

4. Influence maximization using the stored GNN model and the Pun model.

```bash
python pun.py
```


5. Train the GNN on the negative samples, using the provided "influence_train_set.csv", constructed as described in section 4.1 of the paper.

```bash
python glie_train.py
```

6. Train Grim on the 50 graphs in "dql_graphs", as described in section 4.2 of the paper.

```bash
python train_dqn.py
```

7. The scripts in the preprocessing folder are used to create "influence_train_set.csv".
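
As background for step 2, the following is a generic sketch of CELF-style lazy greedy seed selection. It is not the code in `celf_glie.py`: the function `celf` and the `spread` oracle are hypothetical, and the toy oracle here is a simple coverage count, whereas the repository presumably plugs in the stored model's influence estimate. The lazy shortcut is only valid when `spread` is monotone and submodular.

```python
import heapq


def celf(nodes, k, spread):
    """Pick up to k seeds greedily, re-evaluating marginal gains lazily.

    `spread(seed_set)` is a hypothetical influence oracle returning a number.
    """
    # Max-heap via negated gains: (-gain, node, seed count when gain was computed).
    heap = [(-spread({v}), v, 0) for v in nodes]
    heapq.heapify(heap)
    seeds, current = [], 0.0
    while len(seeds) < k and heap:
        neg_gain, v, computed_at = heapq.heappop(heap)
        if computed_at == len(seeds):
            # Gain is up to date for the current seed set, so accept the node.
            seeds.append(v)
            current += -neg_gain
        else:
            # Stale gain: recompute against the current seeds and push back.
            gain = spread(set(seeds) | {v}) - current
            heapq.heappush(heap, (-gain, v, len(seeds)))
    return seeds, current


# Toy oracle: each node "reaches" a fixed set; spread is the size of the union.
reach = {0: {0, 1, 2}, 1: {1, 2}, 2: {2, 3}, 3: {3}}


def coverage(seed_set):
    covered = set()
    for s in seed_set:
        covered |= reach[s]
    return float(len(covered))


seeds, value = celf(reach, 2, coverage)
print(seeds, value)  # [0, 2] 4.0
```

The benefit of the lazy evaluation is that most nodes are never re-scored after the first round, which matters when each call to the oracle is expensive.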





