# Learning Combinatorial Node Optimization Algorithms

## Requirements
- Python >= 3.8
- PyTorch >= 1.7
- [PyTorch Geometric](https://pytorch-geometric.readthedocs.io/en/latest/index.html) >= 1.6
- NumPy
- NetworkX
- matplotlib

Optional (for logging statistics)
- Neptune (neptune.ai)

## How to train a model?

You can train a model by calling the `main.py` script. For example, to train a graph coloring model, call the following:

`python ./main.py --problem=GC --train_graph_nodes 20 40 50 70 100 --val_graph_nodes 100 --graph_types="[('S-ER',  {'er_d': 7.5}), ('WS', {'k': 5, 'p_ws': 0.1}), ('BA', {'m':4})]" --train_samples=4000 --val_samples=2000 --n_epochs=200 --seed 899 502`

In this example, the training set contains 4000 graphs for each of the sizes 20, 40, 50, 70, and 100.
The validation set contains 2000 graphs of size 100.

To train a minimum vertex cover model, call the following:

`python ./main.py --problem=MVC --train_graph_nodes 20 40 50 70 100 --val_graph_nodes 170 --graph_types="[('ER',  {'p': 0.15}), ('BA', {'m':4})]" --train_samples=2500 --val_samples=2000 --n_epochs=200 --seed 899 502`


Alternatively, you can use the functions in `trainer.trainer` directly, in which case you might want to create a `trainer.model.Attention` model.

### What should I consider when training?

- The effective batch size is equal to the `--batch_size` parameter multiplied by the number of sizes provided to `--train_graph_nodes`. 

- If you pass multiple seeds to `--seed`, one model is trained per seed, in parallel, each in its own process. This only pays off on a machine with enough CPU cores.
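
As a quick check of the batch-size rule above, the following minimal sketch computes the effective batch size. The helper `effective_batch_size` and the value `32` are illustrative assumptions, not part of this repository; substitute whatever you pass to `--batch_size`.

```python
def effective_batch_size(batch_size, train_graph_nodes):
    """Effective batch size: --batch_size times the number of training sizes."""
    return batch_size * len(train_graph_nodes)


# Sizes taken from the graph coloring example above.
train_graph_nodes = [20, 40, 50, 70, 100]

# Hypothetical value for --batch_size (the script's default is not stated here).
batch_size = 32

print(effective_batch_size(batch_size, train_graph_nodes))  # 160
```

If memory is tight, this suggests lowering `--batch_size` when adding more sizes to `--train_graph_nodes`.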


### Which synthetic graph types are supported for training?

You can train on different graph classes by passing a list of tuples to `--graph_types`, where each tuple holds the short name of the graph class followed by a dictionary of its parameters.

- Sparse Erdos-Renyi graphs (S-ER). Parameter 'er_d' specifies the expected degree of the nodes.
- Barabasi-Albert graphs (BA). Parameter 'm' specifies the number of edges that each new node attaches to existing nodes.
- Erdos-Renyi graphs (ER). Parameter 'p' specifies the probability that a given edge exists.
- Watts-Strogatz (WS). Parameter 'k' specifies the average degree, parameter 'p_ws' specifies the re-wiring probability.
- Random Geometric (RG). No parameters. Internally picks parameters close to the connectivity threshold. 
- Random Trees (Tree). No parameters.

For example, `--graph_types="[('ER',  {'p': 0.15}), ('BA', {'m':4})]"` means ER graphs with parameter p=0.15 and BA graphs with parameter m=4.
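
To make the structure of the `--graph_types` value concrete, here is a minimal sketch that parses such a string into a list of `(name, params)` tuples. It assumes the string is a Python literal and uses `ast.literal_eval`; how `main.py` actually parses the argument is not shown here. The helpers `parse_graph_types` and `ser_edge_probability` are hypothetical. The second one uses the standard fact that a G(n, p) graph has expected degree p(n - 1) to relate 'er_d' to an edge probability; the repository may compute this differently.

```python
import ast


def parse_graph_types(spec):
    """Parse a --graph_types string into a list of (name, params) tuples."""
    parsed = ast.literal_eval(spec)  # safe evaluation of Python literals only
    result = []
    for name, params in parsed:
        if not isinstance(name, str) or not isinstance(params, dict):
            raise ValueError(f"expected (str, dict), got {(name, params)!r}")
        result.append((name, params))
    return result


def ser_edge_probability(n_nodes, er_d):
    """Edge probability p such that G(n, p) has expected degree er_d (assumed mapping)."""
    return min(1.0, er_d / (n_nodes - 1))


spec = "[('S-ER',  {'er_d': 7.5}), ('WS', {'k': 5, 'p_ws': 0.1}), ('BA', {'m':4})]"
for name, params in parse_graph_types(spec):
    print(name, params)

# Edge probability for a sparse ER graph with 101 nodes and expected degree 7.5.
print(ser_edge_probability(101, 7.5))
```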

Several classic fixed-structure graphs are also supported.

- Star graph (Star). No parameters.
- Cycle graph (Cycle). No parameters.
- Binomial tree graph (BinomialTree). No parameters.
- 2D Grid lattice (Grid2D). Parameter 'm' specifies the width of the grid. Number of nodes must be divisible by m.
- Wheel graph (Wheel). No parameters.

For example, `--graph_types="[('Cycle',  {})]"`.
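
The divisibility requirement for Grid2D follows from the grid shape: with width m and n nodes, the grid has n / m rows. The sketch below illustrates this with a hypothetical helper `grid2d_edges`, which numbers nodes row by row; it is not the repository's implementation, only an illustration of the constraint.

```python
def grid2d_edges(n_nodes, m):
    """Return the edge list of a 2D grid with n_nodes nodes and width m.

    Node i sits at row i // m and column i % m (an assumed numbering).
    """
    if m <= 0 or n_nodes % m != 0:
        raise ValueError("number of nodes must be divisible by the width m")
    height = n_nodes // m
    edges = []
    for row in range(height):
        for col in range(m):
            node = row * m + col
            if col + 1 < m:          # neighbor to the right
                edges.append((node, node + 1))
            if row + 1 < height:     # neighbor below
                edges.append((node, node + m))
    return edges


edges = grid2d_edges(20, 4)  # a grid with 5 rows and 4 columns
print(len(edges))            # 31
```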

More graph types are supported (see `trainer.dataset`).

## How to evaluate a model?
The [Models](Models) directory contains a collection of trained models. 
You can evaluate a model under different conditions with the methods in `evaluate.py`.

In particular:
- Use `test_dimacs_files` to evaluate a model on graphs in DIMACS color challenge format. The [Instances](Instances) directory contains a few DIMACS graphs.
- Use `test_pickled_graphs` to evaluate a model on pickled networkx instances.
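
If you want to evaluate on your own graphs, you can write them in the DIMACS edge format used by the coloring challenge: comment lines start with `c`, one problem line `p edge <nodes> <edges>` gives the sizes, and each edge is a line `e <u> <v>` with 1-based node indices. The sketch below converts between a 0-based edge list and that text format. The helpers `to_dimacs` and `parse_dimacs` are hypothetical and independent of `evaluate.py`, whose function signatures are not shown here.

```python
def to_dimacs(n_nodes, edges, comment="example graph"):
    """Serialize a 0-based edge list to DIMACS edge-format text (1-based nodes)."""
    lines = [f"c {comment}", f"p edge {n_nodes} {len(edges)}"]
    lines += [f"e {u + 1} {v + 1}" for u, v in edges]
    return "\n".join(lines) + "\n"


def parse_dimacs(text):
    """Parse DIMACS edge-format text into (n_nodes, 0-based edge list)."""
    n_nodes, edges = 0, []
    for line in text.splitlines():
        parts = line.split()
        if not parts or parts[0] == "c":
            continue  # skip blank lines and comments
        if parts[0] == "p":
            n_nodes = int(parts[2])
        elif parts[0] == "e":
            edges.append((int(parts[1]) - 1, int(parts[2]) - 1))
    return n_nodes, edges


text = to_dimacs(3, [(0, 1), (1, 2)])
print(text)
print(parse_dimacs(text))
```

Writing `text` to a file (for example with a `.col` extension, as is common for coloring instances) yields an instance in the same format as the graphs in the [Instances](Instances) directory.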

You can also evaluate multiple models (residing in a single directory) in one call using `test_dimacs_files_multiple_models` and `test_pickled_graphs_multiple_models`.

## How to solve my own optimization problem?

The [Problems](Problems) directory contains `problems.problem_base`, which defines the `Problem` and `State` classes you need to implement. You will also need to change `load_problem`.

