# GraphLog Data Generator

This script generates data for `GraphLog`. The following command generates the data used for our ICML submission:

```
python prepare_data.py \
    --num_rel_choices "10" \
    --num_splits 100 \
    --graphs_per_world 5000 \
    --num_worlds 100 \
    --per_inverse_choices "0.5" \
    --corrupt_eps_choices "0" \
    --expand_steps_choices "5" \
    --uniform_prob \
    --fix_num_relations \
    --policy overlap \
    --folder ~/checkpoint/lgw/data/comp_r10_n100_ov \
    --save_path ~/checkpoint/lgw/dat/comp_r10_n100_ov \
    --num_nodes 5000 \
    --sample_graphs \
    --num_train_rows 5000 \
    --num_valid_rows 1000 \
    --num_test_rows 1000 \
    --world_train_val_test_split 0.95
```
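
If you prefer to launch the generator from Python, the command above can be assembled as an argument list. This is only a sketch: `OPTIONS`, `FLAGS`, and `build_command` are hypothetical names introduced here, not part of `prepare_data.py`. It assumes `prepare_data.py` is in the current directory and expands `~` explicitly, since no shell is involved when a list is passed to `subprocess.run`.

```python
import os
import shlex

# Option values copied from the command above (kept as strings).
OPTIONS = {
    "num_rel_choices": "10",
    "num_splits": "100",
    "graphs_per_world": "5000",
    "num_worlds": "100",
    "per_inverse_choices": "0.5",
    "corrupt_eps_choices": "0",
    "expand_steps_choices": "5",
    "policy": "overlap",
    "folder": "~/checkpoint/lgw/data/comp_r10_n100_ov",
    "save_path": "~/checkpoint/lgw/dat/comp_r10_n100_ov",
    "num_nodes": "5000",
    "num_train_rows": "5000",
    "num_valid_rows": "1000",
    "num_test_rows": "1000",
    "world_train_val_test_split": "0.95",
}

# Boolean switches from the command above (no value follows them).
FLAGS = ["uniform_prob", "fix_num_relations", "sample_graphs"]


def build_command(script="prepare_data.py"):
    """Return the argument list for the data generation run (hypothetical helper)."""
    cmd = ["python", script]
    for name, value in OPTIONS.items():
        # expanduser leaves values without a leading "~" unchanged.
        cmd += [f"--{name}", os.path.expanduser(value)]
    cmd += [f"--{flag}" for flag in FLAGS]
    return cmd


# Print the command for inspection before running it.
print(shlex.join(build_command()))

# To actually launch it:
#   import subprocess
#   subprocess.run(build_command(), check=True)
```

Keeping the options in a dictionary makes it easier to change a single value, such as `num_worlds`, without editing a long one-line command.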

The data used for the paper can also be [downloaded here](https://drive.google.com/file/d/1nsVr-CXYouzrdiQUgqSbQLcfQJduTJyg/view?usp=sharing).

## Usage

```
usage: prepare_data.py [-h] [--num_rel NUM_REL] [--per_inverse PER_INVERSE]
                       [--corrupt_eps CORRUPT_EPS] [--rule_name RULE_NAME]
                       [--uniform_prob] [--fix_num_relations]
                       [--policy POLICY] [--num_splits NUM_SPLITS]
                       [--num_nodes NUM_NODES] [--expand_steps EXPAND_STEPS]
                       [--add_noise] [--save_path SAVE_PATH] [--sanity]
                       [--bidirectional] [--gen_graph_cyles GEN_GRAPH_CYLES]
                       [--randomize_steps]
                       [--world_graph_expand_steps WORLD_GRAPH_EXPAND_STEPS]
                       [--world_graph_per_edges WORLD_GRAPH_PER_EDGES]
                       [--sample_worlds] [--sample_graphs]
                       [--num_train_rows NUM_TRAIN_ROWS]
                       [--num_valid_rows NUM_VALID_ROWS]
                       [--num_test_rows NUM_TEST_ROWS] [--easy_mode]
                       [--path_cutoff PATH_CUTOFF] [--num_worlds NUM_WORLDS]
                       [--graphs_per_world GRAPHS_PER_WORLD]
                       [--train_test_split TRAIN_TEST_SPLIT]
                       [--train_val_split TRAIN_VAL_SPLIT]
                       [--world_train_val_test_split WORLD_TRAIN_VAL_TEST_SPLIT]
                       [--num_rel_choices NUM_REL_CHOICES]
                       [--per_inverse_choices PER_INVERSE_CHOICES]
                       [--corrupt_eps_choices CORRUPT_EPS_CHOICES]
                       [--expand_steps_choices EXPAND_STEPS_CHOICES]
                       [--folder_name FOLDER_NAME] [--load_rule LOAD_RULE]
                       [--config_path CONFIG_PATH] [--eval_k_shot EVAL_K_SHOT]
                       [--eval_k_epoch EVAL_K_EPOCH] [--output OUTPUT]
                       [--config_toggle_true CONFIG_TOGGLE_TRUE]
                       [--eval_data_mode EVAL_DATA_MODE]
                       [--eval_load_epoch EVAL_LOAD_EPOCH]
                       [--eval_rules EVAL_RULES]
                       [--eval_data_folder EVAL_DATA_FOLDER]
                       [--eval_store_rep]

optional arguments:
  -h, --help            show this help message and exit
  --num_rel NUM_REL     number of relations for the current rule
  --per_inverse PER_INVERSE
                        percentage of inverse relations
  --corrupt_eps CORRUPT_EPS
                        corruption epsilon greedy percentage for each rule
  --rule_name RULE_NAME
                        placeholder for rule name
  --uniform_prob        make all rules with uniform probability
  --fix_num_relations   if true, then create the first rule world, and
                        subsequently only allow the heads of the first rule
                        world in the rest
  --policy POLICY       random/flip/sanity/fsrl_1
  --num_splits NUM_SPLITS
                        number of splits for the current config
  --num_nodes NUM_NODES
                        number of nodes
  --expand_steps EXPAND_STEPS
                        max number of steps expansion takes place
  --add_noise           add noise
  --save_path SAVE_PATH
                        save path (to be prepended by $HOME/mlp)
  --sanity              if true, generate graphs from the same rule world
  --bidirectional       if true, then the main resolution path contain
                        bidirectional edges for the no-noise case
  --gen_graph_cyles GEN_GRAPH_CYLES
                        cycles in the to generate in the big graph
  --randomize_steps
  --world_graph_expand_steps WORLD_GRAPH_EXPAND_STEPS
                        max world graph expansions
  --world_graph_per_edges WORLD_GRAPH_PER_EDGES
                        world graph percentage edges
  --sample_worlds       if true, sample the worlds
  --sample_graphs       if true, sample the graphs
  --num_train_rows NUM_TRAIN_ROWS
                        number of train
  --num_valid_rows NUM_VALID_ROWS
                        number of valid
  --num_test_rows NUM_TEST_ROWS
                        number of test
  --easy_mode           draw test graphs from train distribution
  --path_cutoff PATH_CUTOFF
                        max length of the resolution path
  --num_worlds NUM_WORLDS
                        number of worlds
  --graphs_per_world GRAPHS_PER_WORLD
                        number of graphs per world
  --train_test_split TRAIN_TEST_SPLIT
                        train test split
  --train_val_split TRAIN_VAL_SPLIT
                        train val split
  --world_train_val_test_split WORLD_TRAIN_VAL_TEST_SPLIT
                        train val test split for worlds
  --num_rel_choices NUM_REL_CHOICES
                        number of relations per world
  --per_inverse_choices PER_INVERSE_CHOICES
                        comma separated `per_world`
  --corrupt_eps_choices CORRUPT_EPS_CHOICES
                        comma separated `corrupt_eps`
  --expand_steps_choices EXPAND_STEPS_CHOICES
                        Expand step choices
  --folder_name FOLDER_NAME
  --load_rule LOAD_RULE
                        load rule world for sanity testing
  --config_path CONFIG_PATH
  --eval_k_shot EVAL_K_SHOT
  --eval_k_epoch EVAL_K_EPOCH
  --output OUTPUT
  --config_toggle_true CONFIG_TOGGLE_TRUE
                        comma separated flag to toggle true in model
  --eval_data_mode EVAL_DATA_MODE
  --eval_load_epoch EVAL_LOAD_EPOCH
  --eval_rules EVAL_RULES
                        comma separated
  --eval_data_folder EVAL_DATA_FOLDER
  --eval_store_rep
```
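
Several of the `*_choices` options are described as comma separated. How `prepare_data.py` parses them is not shown in this README, so the following is only a minimal sketch of one way such values could be handled with `argparse`. The helper `comma_separated` is a hypothetical name, and the element types (`int` for relation counts, `float` for the percentages) are assumptions.

```python
import argparse


def comma_separated(cast):
    """Build an argparse `type` callable that splits on commas and casts each item."""

    def parse(text):
        # Skip empty items so that a trailing comma is tolerated.
        return [cast(item.strip()) for item in text.split(",") if item.strip()]

    return parse


parser = argparse.ArgumentParser()
# Element types below are assumptions, not taken from prepare_data.py.
parser.add_argument("--num_rel_choices", type=comma_separated(int))
parser.add_argument("--per_inverse_choices", type=comma_separated(float))
parser.add_argument("--corrupt_eps_choices", type=comma_separated(float))

args = parser.parse_args(
    [
        "--num_rel_choices", "10,20",
        "--per_inverse_choices", "0.5",
        "--corrupt_eps_choices", "0",
    ]
)
print(args.num_rel_choices, args.per_inverse_choices, args.corrupt_eps_choices)
# prints: [10, 20] [0.5] [0.0]
```

A single value such as `"10"` yields a one-element list, which matches how the command at the top of this README passes one choice per option.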