This is the data and code necessary to reproduce the main results in the paper "Unsupervised Learning with Network-Aware Embeddings".

The repository contains one folder and four main scripts.

The "data" folder contains the data for Figure 4. There are three datasets: tradeatlas, littlesis and tivoli. Each dataset comes with two files:

- "*_network.csv" files contain the graph structure in edgelist format. The first column is the first node in the edge and the second column is the second node in the edge. The networks are undirected. Only the tivoli file has a third column, which holds the weight of the edge.
- "*_nodeattributes.csv" files contain the node attributes. These are one value per node for the first n columns (with n being the number of nodes in the network). The last column, called "label", represents the ground truth, i.e. the target of the prediction.
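
As an illustration of the two file layouts described above, here is a minimal sketch of how they could be read using only the Python standard library. It is not the loading code used by the scripts. The function names read_edgelist and read_node_attributes are hypothetical, and the sketch assumes comma-separated files, one row per node in the attribute file, numeric attribute values, and a header row in the attribute file (needed to find the "label" column). Whether the network files have a header row is not stated here, so that is left as a parameter.

```python
import csv
from collections import defaultdict


def read_edgelist(path, has_header=False):
    """Read an undirected edge list; a missing third column means weight 1.0."""
    adjacency = defaultdict(dict)
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        if has_header:
            next(reader, None)  # skip the header row (assumption: set by caller)
        for row in reader:
            if not row:
                continue  # ignore blank lines
            u, v = row[0], row[1]
            weight = float(row[2]) if len(row) > 2 else 1.0
            # Undirected network: store the edge in both directions.
            adjacency[u][v] = weight
            adjacency[v][u] = weight
    return dict(adjacency)


def read_node_attributes(path):
    """Return (features, labels), assuming one row per node and a 'label' column."""
    features, labels = [], []
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            labels.append(row.pop("label"))  # ground truth column
            features.append([float(value) for value in row.values()])
    return features, labels
```

For example, read_edgelist("data/tivoli_network.csv") would return a dictionary of neighbors with edge weights, assuming the file name follows the "*_network.csv" pattern above.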

The four scripts are the following:

- "network_clustering.py": this is the library developed for the paper and should not be run directly.
- "01_fig_3_tab_a1.py": this reproduces the results on synthetic networks, for Figure 3 and Table A1 in the main paper. Runs without parameters, with Python.
- "02_fig_4.py": this reproduces the results on real-world networks, for Figure 4 in the main paper. Runs without parameters, with Python.
- "03_fig_5.jl": this reproduces the runtime scalability results, for Figure 5 in the main paper. Runs without parameters, with Julia.

The following dependencies are needed to run all experiments:

Python: numpy, scipy, pandas, networkx, sklearn, pytorch, pytorch geometric.
Julia: Laplacians, LinearAlgebra, Graphs, SimpleWeightedGraphs.
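
Before running the Python scripts, it can help to check that the Python dependencies listed above are importable. The sketch below is one way to do that and is not part of the repository. The helper name missing_modules is hypothetical, and the list assumes the usual import names: "torch" for pytorch and "torch_geometric" for pytorch geometric.

```python
import importlib.util

# Import names for the Python dependencies listed above
# (assumed mapping: pytorch -> torch, pytorch geometric -> torch_geometric).
REQUIRED_MODULES = [
    "numpy", "scipy", "pandas", "networkx",
    "sklearn", "torch", "torch_geometric",
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

The check only looks the modules up without importing them, so it is quick even when the large packages are installed.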
