Summary
-------
This repository contains code and utilities to evaluate algorithms that convert probabilistic edge predictions into TSP 
tours. The focus is on comparing decoding strategies (greedy variants, $\text{CHR}^+$, beam search, 2-opt local improvement) 
when predictions are provided as edge-level probabilities.

Contents
--------
- Python scripts:
  - `dataset_U.py` - Run experiments on synthetic uniform TSP instances (dataset $\mathcal{U}$) and produce 
  `output/results_U.csv`.
  - `dataset_T.py` - Evaluate decoding strategies on TSPLIB instances (dataset $\mathcal{T}$) and produce 
`output/results_T.csv`.
  - `beam_search.py` - Run beam-search-based decoding experiments and/or generate summary tables (output CSV).
  - `2_opt_improvement.py` - Evaluate the effect of applying 2-opt improvement to predicted tours.
  - `compare_methods_for_generating_P.py` - Compare different ways of converting heatmaps into predictions 
  used by $\text{CHR}^+$ (Alg1, Alg1Top, $\text{CHR}^+$).
  - `chrp.py` - Implementation of $\text{CHR}^+$ (Christofides augmented with predictions).
  - `utils.py` - Collection of helper functions used across scripts.
  - `difusco_focus.py` - Script to reproduce Figure 8.
  - `smoothness_with_epsilon_alps_greedy.py` - Script to reproduce Figure 7 and Figure 16.

- `data/` - Contains datasets used by the experiments and a `generate_data.py` helper to create instance pickles.
  - `data/tsp_uniform/` - Synthetic instance pickles named like `<n>_<seed>.pkl`.
  - `data/tsplib/` - TSPLIB instances converted to the project's pickle format.
  - `data/README.md` - Instructions on generating and formatting data.
  - `data/concorde.py` - Wrapper around Concorde TSP solver for optimal tour computation.

- `output/` - Experiment outputs (CSV, pickles, figures).
- `figures/` - Plots generated by the analysis scripts.
- `environment.yml` - Conda environment specification with the main dependencies.

Quick start / Requirements
-------------------------
This project was developed with Python 3.13 (see `environment.yml`). 
Create a conda environment:

```bash
conda env create -f environment.yml
conda activate tour_from_neural_predictions
```

Notes about the data format
-------------------------
All TSP instances are stored as NetworkX graphs pickled to disk. Read `./data/README.md` for details on generating these pickles.
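
As a minimal sketch of reading these files, the helpers below unpickle one instance or every `.pkl` file in a folder. The function names `load_instance` and `load_all_instances` are hypothetical and not part of the repository; the sketch only assumes that each file holds a single pickled object, and that `networkx` is installed when the object is a graph.

```python
import pickle
from pathlib import Path


def load_instance(path):
    """Unpickle one instance file (per this README, a NetworkX graph)."""
    # Only unpickle files you trust: pickle can execute arbitrary code.
    with Path(path).open("rb") as f:
        return pickle.load(f)


def load_all_instances(folder):
    """Load every *.pkl file in `folder`, keyed by file stem (e.g. "<n>_<seed>")."""
    return {p.stem: load_instance(p) for p in sorted(Path(folder).glob("*.pkl"))}


# Usage from the repository root (hypothetical, requires the data to be present):
# for name, G in load_all_instances("data/tsp_uniform").items():
#     print(name, G.number_of_nodes(), G.number_of_edges())
```

Node and edge attribute names are not shown here because they depend on the format described in `./data/README.md`.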

Main scripts and examples
-------------------------
All scripts are intended to be run from the repository root. Examples below assume you are in the `tour_from_neural_predictions/` folder.

1) Create two folders: `output/` and `figures/` to store experiment results and generated plots.
```bash
mkdir "output"
mkdir "figures"
```

2) Run dataset experiments on synthetic instances ($\mathcal{U}$):

```bash
python dataset_U.py
```

3) Evaluate on TSPLIB instances ($\mathcal{T}$):

```bash
python dataset_T.py
```

4) Beam-search experiments and table generation:

```bash
python beam_search.py
# The script has a `run` flag; when `run = False` it reads a precomputed CSV and prints LaTeX-ready tables.
```
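
For readers unfamiliar with the technique, here is a small, self-contained sketch of beam search over nodes scored by log-probabilities, with the final tour chosen as the shortest complete candidate. It is an illustration under assumptions, not the repository's `beam_search`: the name `beam_search_sketch`, the dense matrix inputs `prob` and `dist`, and the `1e-12` clamp are all hypothetical choices.

```python
import math


def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def beam_search_sketch(prob, dist, beam_width=3, start=0):
    """Extend partial tours node by node, keeping the best `beam_width` of them."""
    n = len(prob)
    beams = [(0.0, [start])]  # (cumulative log-probability, partial tour)
    for _ in range(n - 1):
        candidates = []
        for score, path in beams:
            visited = set(path)
            for v in range(n):
                if v not in visited:
                    # Clamp to avoid log(0) for edges predicted with probability 0.
                    p = max(prob[path[-1]][v], 1e-12)
                    candidates.append((score + math.log(p), path + [v]))
        # Keep the highest-scoring partial tours.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_width]
    # Final selection: the lowest-weight complete tour among the candidates.
    return min((path for _, path in beams), key=lambda t: tour_length(t, dist))


# Toy instance: four corners of a unit square, with high probability on its sides.
pts = [(0, 0), (1, 0), (1, 1), (0, 1)]
dist = [[math.dist(p, q) for q in pts] for p in pts]
prob = [[0.0, 0.9, 0.1, 0.9],
        [0.9, 0.0, 0.9, 0.1],
        [0.1, 0.9, 0.0, 0.9],
        [0.9, 0.1, 0.9, 0.0]]
print(beam_search_sketch(prob, dist, beam_width=2))
```

A wider beam explores more partial tours at a higher cost per step; with `beam_width=1` the search reduces to a greedy nearest-neighbor-style decoder on the probabilities.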

5) 2-opt improvements analysis:

```bash
python 2_opt_improvement.py
```

This will evaluate the effect of running the `two_opt` local search on tours produced by different decoding strategies 
and save results in `output/`.
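
To make the local search concrete, the following is a minimal sketch of textbook 2-opt on a symmetric distance matrix. It is not the repository's `two_opt`; the name `two_opt_sketch` and the list-of-lists `dist` input are assumptions made for this example.

```python
import math


def tour_length(tour, dist):
    """Length of the closed tour, including the edge back to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def two_opt_sketch(tour, dist):
    """Apply improving 2-opt moves until none remains (symmetric distances assumed)."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share node tour[0]; nothing to swap
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # Change in length when edges (a,b),(c,d) become (a,c),(b,d).
                delta = dist[a][c] + dist[b][d] - dist[a][b] - dist[c][d]
                if delta < -1e-12:
                    # Reversing the segment between b and c realizes the swap.
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour


# Toy instance: a self-crossing tour on the corners of a unit square.
pts = [(0, 0), (1, 0), (1, 1), (0, 1)]
dist = [[math.dist(p, q) for q in pts] for p in pts]
better = two_opt_sketch([0, 2, 1, 3], dist)
print(better, tour_length(better, dist))  # [0, 1, 2, 3] 4.0
```

Each accepted move strictly shortens the tour, so the loop terminates at a tour that no single 2-opt move can improve.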

6) Compare methods for generating discrete predictions P:

```bash
python compare_methods_for_generating_P.py
```

This script compares three strategies (`Alg1`, `Alg1Top`, and $\text{CHR}^+$) for converting probability outputs into the 
discrete edge predictions used by $\text{CHR}^+$. It produces a figure under `figures/` and optionally caches results under `output/`.

Important implementation details and conventions
-----------------------------------------------
- Decoding strategies in the codebase:
  - G1: greedy nearest-neighbor guided by predicted probabilities (`greedy_with_probabilities_nearest_neighbor`).
  - G2: greedy by edges (`greedy_with_probabilities_edge`), which builds a tour by adding high-probability edges while avoiding premature cycles.
  - ALPS (⭐) / $\text{CHR}^+$: `chrp.py` modifies edge weights using predictions and runs the Christofides heuristic.
  - BS: beam search over nodes using log-probability scores (`beam_search` in `utils.py`). 
    The final selection is the lowest-weight complete tour among the beam candidates.

- Many scripts write CSV files into `output/`. Some scripts include a `run` boolean to toggle between actually running 
  long experiments and creating tables from previously computed CSVs. Check the top of the script to change this behavior.

(⭐) Our algorithm was originally named ALPS (*AL*gorithm with *P*rediction*S*). We later adopted the name 
$\text{CHR}^+$ to emphasize its connection to Christofides. We did our best to update all code and comments to reflect this change, but a few names (for example `smoothness_with_epsilon_alps_greedy.py`) still use the old one.
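
The edge-based greedy idea (G2) described above can be sketched as follows. This is an illustrative reconstruction, not the repository's `greedy_with_probabilities_edge`: the name `greedy_edge_sketch`, the symmetric matrix input `prob`, and the union-find cycle check are assumptions made for this example.

```python
def greedy_edge_sketch(prob):
    """Build a tour by adding edges in decreasing probability order.

    An edge is accepted only if both endpoints still have degree < 2 and it
    does not close a cycle early. Assumes `prob` is a symmetric n x n matrix.
    """
    n = len(prob)
    parent = list(range(n))  # union-find forest used to detect premature cycles

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    degree = [0] * n
    adj = [[] for _ in range(n)]
    edges = sorted(((prob[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
                   reverse=True)
    added = 0
    for _, i, j in edges:
        if added == n - 1:
            break  # n - 1 accepted edges already form a Hamiltonian path
        if degree[i] < 2 and degree[j] < 2 and find(i) != find(j):
            parent[find(i)] = find(j)
            adj[i].append(j)
            adj[j].append(i)
            degree[i] += 1
            degree[j] += 1
            added += 1

    # Walk the path from one endpoint; the tour closes back to the start implicitly.
    start = next(v for v in range(n) if degree[v] < 2)
    tour, prev = [start], None
    while len(tour) < n:
        nxt = next(v for v in adj[tour[-1]] if v != prev)
        prev = tour[-1]
        tour.append(nxt)
    return tour


# Toy prediction: high probability on the sides of a 4-cycle 0-1-2-3-0.
prob = [[0.0, 0.9, 0.1, 0.9],
        [0.9, 0.0, 0.9, 0.1],
        [0.1, 0.9, 0.0, 0.9],
        [0.9, 0.1, 0.9, 0.0]]
print(greedy_edge_sketch(prob))
```

On a complete graph the loop always ends with a single Hamiltonian path, because two endpoints of different path fragments can always still be joined.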

Reproducing experiments
-----------------------
1. Ensure data is present under `data/` (either synthetic pickles in `data/tsp_uniform/` or TSPLIB pickles in `data/tsplib/`). If missing, create synthetic instances using `data/generate_data.py` (inspect that script for options).

2. Create the environment using `environment.yml` or pip-install the dependencies.

3. Run the desired top-level script(s). For long-running experiments, set `run = False` when you only want to regenerate tables from cached CSVs.
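
The `run` toggle mentioned above follows a common compute-or-load pattern, sketched below with only the standard library. The helper `load_or_run`, its arguments, and the column names in the usage comment are hypothetical; the actual scripts may organize this differently.

```python
import csv
from pathlib import Path


def load_or_run(csv_path, experiment, run):
    """Run `experiment` and cache its rows as CSV, or read the cached CSV instead.

    `experiment` is any callable returning a non-empty list of dicts that
    share the same keys (one dict per result row).
    """
    csv_path = Path(csv_path)
    if run:
        rows = experiment()
        with csv_path.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)
        return rows
    # run = False: reuse previously computed results (values come back as strings).
    with csv_path.open(newline="") as f:
        return list(csv.DictReader(f))


# Hypothetical usage:
# rows = load_or_run("output/my_results.csv",
#                    lambda: [{"method": "G1", "gap": 0.5}],
#                    run=True)
```

Because CSV stores text, cached numeric columns need to be converted back (for example with `float`) before building tables.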


Acknowledgements
----------------
The decoding strategies and experimental setup are inspired by recent work on using graph-based neural predictors for 
combinatorial problems. More specifically, we obtained our predictions from the following papers:
- DIFUSCO → Sun, Zhiqing, and Yiming Yang. "Difusco: Graph-based diffusion solvers for combinatorial optimization." Advances in neural information processing systems 36 (2023): 3706-3731.
- GNN-AR → Joshi, Chaitanya K., et al. "Learning the travelling salesperson problem requires rethinking generalization." Constraints 27.1 (2022): 70-98.
- GNN-GLS → Hudson, Benjamin, et al. "Graph Neural Network Guided Local Search for the Traveling Salesperson Problem." International Conference on Learning Representations (2022).
- SoftDist → Xia, Yifan, et al. "Position: Rethinking Post-Hoc Search-Based Neural Approaches for Solving Large-Scale Traveling Salesman Problems." Proceedings of the 41st International Conference on Machine Learning (2024): 54178-90.
