Thank you for reading!

The supplementary materials contain the following folders and files.

The subfolder "QM9graphs_50closest_pairs" has 50 subfolders, each containing a pair of (near-)duplicate molecules in QM9, given as xyz files that can be opened in ParaView.

The subfolder "QM9graphs_43deapest_minima" has 43 xyz files of the most stable molecules in QM9, namely those whose minimum free energy (G) barrier over their 5 nearest neighbours is at least 5.

The largest table QM9graphs_SDV_free_energy_deepest_minima.csv contains the key information about the 5 nearest neighbours of all QM9 molecules. The most important (final) column is the minimum barrier: the minimum, over these neighbours, of the neighbour's energy minus the molecule's own energy.
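
As an illustration of how the minimum barrier is defined, here is a minimal Python sketch. The function name min_barrier and the energy values are hypothetical and are not taken from the table or from the code in these materials; the sketch only assumes that the energies of a molecule and of its nearest neighbours are already known.

```python
def min_barrier(own_energy, neighbour_energies):
    """Smallest value of (neighbour's energy - own energy) over the neighbours."""
    if not neighbour_energies:
        raise ValueError("at least one neighbour is required")
    return min(e - own_energy for e in neighbour_energies)


# Made-up energies for illustration only (not QM9 data).
own = -1.0
neighbours = [-0.5, 0.0, 0.25, 0.5, 1.0]  # 5 nearest neighbours

barrier = min_barrier(own, neighbours)
print(barrier)  # positive: the molecule has lower energy than all 5 neighbours
```

A positive barrier means that the molecule has a lower energy than each of its neighbours; a negative value means that at least one neighbour has a lower energy.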

The code for the invariants SDV, PDD, SCD is in the folder C++invariants. 

The code for the distances Linf, EMD, bottleneck-based SBM is in graph_scripts.ipynb. 

Computer specifications: AMD Ryzen 9 3950X processor (3.5 GHz, 64 MB of L3 cache), 80 GB of RAM (2x32 GB + 2x8 GB) at 3200 MT/s.

The running times below are in seconds.

Linf distance on SDV invariants for all 873,527,974 pairs of molecules with the same number of atoms (one CPU core per category of molecules with a fixed number of atoms): 56143 sec = 15.6 hours.
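
The number of pairs above counts only molecules with the same number of atoms, so it is a sum of n(n-1)/2 over the categories. A minimal sketch of this count, with a hypothetical function name and made-up category sizes (the actual QM9 category sizes are not listed here):

```python
def count_same_size_pairs(molecules_per_size):
    """Count unordered pairs of molecules that have the same number of atoms.

    molecules_per_size maps a number of atoms to how many molecules have it.
    """
    return sum(n * (n - 1) // 2 for n in molecules_per_size.values())


# Toy category sizes for illustration only (not the real QM9 counts).
toy_counts = {3: 4, 4: 5}  # 4 molecules with 3 atoms, 5 molecules with 4 atoms
print(count_same_size_pairs(toy_counts))  # 6 + 10 pairs
```

Because pairs are never formed across categories, each category can be processed independently, for example on its own core.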

EMD distance on PDD invariants for the 1% = 8,735,279 closest pairs of molecules: 58827 sec = 16.3 hours.

SCD invariants for the 10K closest pairs of molecules: 530422 sec = 147 hours; SBM distance: 134 sec.
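
The conversions of the running times to hours and the size of the 1% subset can be checked with a short Python sketch. The variable and function names are chosen for illustration; the numbers are those reported above.

```python
def seconds_to_hours(seconds):
    """Convert a running time in seconds to hours."""
    return seconds / 3600


# Running times in seconds, as reported above.
timings_sec = {
    "Linf on SDV": 56143,
    "EMD on PDD": 58827,
    "SCD invariants": 530422,
}

for name, sec in timings_sec.items():
    print(f"{name}: {sec} sec = {seconds_to_hours(sec):.1f} hours")

# 1% of all pairs of molecules with the same number of atoms, rounded down.
all_pairs = 873_527_974
one_percent = all_pairs // 100
print(one_percent)
```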