# Gaussian Process Visualization on Cora Graph

This set of scripts visualizes Gaussian processes on the Cora citation graph with different smoothness parameters.

## Background

A Gaussian process on a graph is defined by its mean (here set to 0) and covariance structure. The covariance structure is derived from the graph Laplacian (Λ) using the following formula:

```
Covariance = (2ν/κ² · I + Λ)^(-ν)
```

where:
- I is the identity matrix
- Λ (Lambda) is the graph Laplacian
- ν (nu) is the smoothness parameter
- κ (kappa) is the scale parameter
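
A non-integer matrix power such as this one is usually evaluated through the eigendecomposition of the Laplacian. The following is a minimal NumPy sketch of that idea, not the code from the scripts; the function name `matern_covariance` and the three-node path graph are illustrative assumptions.

```python
import numpy as np

def matern_covariance(laplacian, nu, kappa):
    """Evaluate (2*nu/kappa**2 * I + L)^(-nu) via the eigendecomposition of L."""
    # A symmetric Laplacian has real eigenvalues and orthonormal eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    # Round-off can push the zero eigenvalue slightly negative; clip it.
    eigvals = np.clip(eigvals, 0.0, None)
    # Apply the scalar map lambda -> (2*nu/kappa^2 + lambda)^(-nu) to the spectrum.
    spectrum = (2.0 * nu / kappa**2 + eigvals) ** (-nu)
    return (eigvecs * spectrum) @ eigvecs.T

# Toy example (assumption): the path graph 0 - 1 - 2 with Laplacian D - A.
adjacency = np.array([[0.0, 1.0, 0.0],
                      [1.0, 0.0, 1.0],
                      [0.0, 1.0, 0.0]])
laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
covariance = matern_covariance(laplacian, nu=2.0, kappa=1.0)
```

For integer ν the result can be cross-checked against repeated matrix inversion, which is a quick sanity test when changing the kernel code.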

## Scripts

### 1. `visualize_gaussian_process_on_cora.py`

This script:
1. Loads the Cora citation network dataset
2. Extracts a connected 200-node subgraph to keep the visualization readable
3. Computes the normalized graph Laplacian
4. Generates Gaussian process samples for two values of ν (0.01 and 100)
5. Visualizes these samples on the graph with:
   - Edges colored by covariance strength (blue intensity)
   - Nodes colored either by:
     - Class labels from Cora dataset (categorical colors)
     - Gaussian process values (continuous colormap)
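
Steps 2 through 4 can be sketched as follows. This is an illustration under stated assumptions rather than the script itself: a 30-node ring stands in for Cora so the example runs without downloading the dataset, breadth-first search is one plausible way to obtain a connected subgraph, and the helper names `bfs_nodes`, `normalized_laplacian`, and `sample_gp` are hypothetical.

```python
from collections import deque
import numpy as np

def bfs_nodes(adjacency, start, max_nodes):
    """Collect up to max_nodes nodes by breadth-first search, so the result is connected."""
    visited, seen, queue = [start], {start}, deque([start])
    while queue and len(visited) < max_nodes:
        node = queue.popleft()
        for neighbor in np.flatnonzero(adjacency[node]):
            neighbor = int(neighbor)
            if neighbor not in seen and len(visited) < max_nodes:
                seen.add(neighbor)
                visited.append(neighbor)
                queue.append(neighbor)
    return np.array(visited)

def normalized_laplacian(adjacency):
    """Symmetric normalized Laplacian I - D^(-1/2) A D^(-1/2)."""
    degree = adjacency.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    inv_sqrt[degree > 0] = degree[degree > 0] ** -0.5
    return np.eye(len(adjacency)) - inv_sqrt[:, None] * adjacency * inv_sqrt[None, :]

def sample_gp(laplacian, nu, kappa, rng):
    """Draw one zero-mean sample with covariance (2*nu/kappa^2 * I + L)^(-nu)."""
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    eigvals = np.clip(eigvals, 0.0, None)
    # Square root of the covariance spectrum, applied to white noise.
    scale = (2.0 * nu / kappa**2 + eigvals) ** (-nu / 2.0)
    return eigvecs @ (scale * rng.standard_normal(len(eigvals)))

# Stand-in graph (assumption): a ring of 30 nodes instead of the Cora graph.
n = 30
adjacency = np.zeros((n, n))
for i in range(n):
    j = (i + 1) % n
    adjacency[i, j] = adjacency[j, i] = 1.0

nodes = bfs_nodes(adjacency, start=0, max_nodes=10)
sub_adjacency = adjacency[np.ix_(nodes, nodes)]  # induced subgraph
laplacian = normalized_laplacian(sub_adjacency)

rng = np.random.default_rng(0)
samples = {nu: sample_gp(laplacian, nu, kappa=1.0, rng=rng) for nu in (0.01, 100)}
```

With the raw formula, the overall amplitude of a sample differs by many orders of magnitude between ν = 0.01 and ν = 100, so a plotting routine would likely rescale each sample before mapping it to a colormap. Whether the script does this is not stated here.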

### 2. `compare_covariance_structures.py`

This script provides a deeper analysis of how the smoothness parameter ν affects the covariance structure:
1. Visualizes the full covariance matrices for different values of ν
2. Shows how covariance propagates from a reference node to other nodes, with:
   - Edge colors showing covariance strength
   - Node colors showing either class labels or covariance with reference node
3. Plots the relationship between graph distance and covariance strength
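
The distance-versus-covariance analysis in step 3 can be approximated as below. This is a sketch, not the script's implementation: it assumes "graph distance" means hop count, uses a 15-node path graph in place of Cora, reports correlations (covariance normalized by the node standard deviations) so that different ν values are comparable, and picks κ = 3 arbitrarily. The names `hop_distances`, `correlation_matrix`, and `profile` are hypothetical.

```python
from collections import deque
import numpy as np

def hop_distances(adjacency, source):
    """Hop distance from source to every node, by breadth-first search."""
    dist = np.full(len(adjacency), -1)
    dist[source] = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in np.flatnonzero(adjacency[node]):
            if dist[neighbor] < 0:
                dist[neighbor] = dist[node] + 1
                queue.append(int(neighbor))
    return dist

def correlation_matrix(laplacian, nu, kappa):
    """Correlation matrix of the kernel (2*nu/kappa^2 * I + L)^(-nu)."""
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    eigvals = np.clip(eigvals, 0.0, None)
    # Work relative to the largest spectral weight to avoid underflow at large nu.
    log_w = -nu * np.log(2.0 * nu / kappa**2 + eigvals)
    weights = np.exp(log_w - log_w.max())
    cov = (eigvecs * weights) @ eigvecs.T
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)

# Stand-in graph (assumption): a path of 15 nodes, reference node in the middle.
n = 15
adjacency = np.zeros((n, n))
for i in range(n - 1):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
degree = adjacency.sum(axis=1)
laplacian = np.eye(n) - adjacency / np.sqrt(np.outer(degree, degree))

reference = n // 2
dist = hop_distances(adjacency, reference)

def profile(nu, kappa):
    """Mean correlation with the reference node, grouped by hop distance."""
    corr = correlation_matrix(laplacian, nu, kappa)[reference]
    return {int(d): float(corr[dist == d].mean()) for d in np.unique(dist)}

rough = profile(nu=0.01, kappa=3.0)
smooth = profile(nu=100.0, kappa=3.0)
```

Plotting `rough` and `smooth` against their keys gives the kind of decay curve described in step 3. On this toy graph the ν = 0.01 profile drops to almost zero after a single hop, while the ν = 100 profile stays high for nearby nodes.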

## Key Insights

### Effect of the Smoothness Parameter (ν)

- **Small ν (e.g., 0.01)**: 
  - Creates rough, less smooth random fields
  - Covariance decays quickly with distance
  - Neighboring nodes can have very different values
  - Local, short-range dependencies dominate
  - Edge colors show strong local correlations but weak distant correlations

- **Large ν (e.g., 100)**:
  - Creates very smooth random fields
  - Covariance decays slowly with distance
  - Values change gradually across the graph
  - Long-range dependencies are strong
  - Edge colors show more uniform, long-range correlations across the graph
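
One way to put a number on "rough" versus "smooth" is the Rayleigh quotient fᵀΛf / fᵀf of the sampled fields, which is small when values change gradually across edges. The sketch below is illustrative only: the 20-node ring, κ = 3, the sample count, and the helper names `sample_fields` and `roughness` are all assumptions, and the spectrum is rescaled by its largest weight, which changes the amplitude of the samples but not their shape.

```python
import numpy as np

def sample_fields(laplacian, nu, kappa, num_samples, rng):
    """Zero-mean samples with covariance proportional to (2*nu/kappa^2 * I + L)^(-nu)."""
    eigvals, eigvecs = np.linalg.eigh(laplacian)
    eigvals = np.clip(eigvals, 0.0, None)
    # Rescale by the largest spectral weight to avoid underflow at large nu.
    log_w = -nu * np.log(2.0 * nu / kappa**2 + eigvals)
    scale = np.exp(0.5 * (log_w - log_w.max()))
    noise = rng.standard_normal((len(eigvals), num_samples))
    return eigvecs @ (scale[:, None] * noise)  # one sample per column

def roughness(laplacian, fields):
    """Pooled Rayleigh quotient sum(f^T L f) / sum(f^T f) over all samples."""
    energy = np.einsum("is,ij,js->", fields, laplacian, fields)
    return energy / np.sum(fields**2)

# Stand-in graph (assumption): a ring of 20 nodes with the normalized Laplacian.
n = 20
adjacency = np.zeros((n, n))
for i in range(n):
    j = (i + 1) % n
    adjacency[i, j] = adjacency[j, i] = 1.0
degree = adjacency.sum(axis=1)
laplacian = np.eye(n) - adjacency / np.sqrt(np.outer(degree, degree))

rng = np.random.default_rng(0)
rough_fields = sample_fields(laplacian, nu=0.01, kappa=3.0, num_samples=200, rng=rng)
smooth_fields = sample_fields(laplacian, nu=100.0, kappa=3.0, num_samples=200, rng=rng)

rough_score = roughness(laplacian, rough_fields)
smooth_score = roughness(laplacian, smooth_fields)
```

For the normalized Laplacian the quotient lies between 0 and 2. Near-white-noise samples land close to the mean eigenvalue (1 on this graph), and smooth samples land much closer to 0. How far apart the two scores are depends on κ as well as ν.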

### Interpretation

The parameter ν in the Matérn-like kernel controls the smoothness of the Gaussian process:

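- As ν → 0: the covariance approaches the identity matrix, so the process approaches white noise
- As ν → ∞: after rescaling, the covariance approaches the heat (diffusion) kernel exp(-κ²Λ/2), the graph analogue of the squared-exponential kernel

This mirrors the behavior of the classical Matérn kernel in Euclidean space, adapted to the graph domain, where the graph Laplacian takes the place of the Euclidean Laplace operator.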
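
Both limits can be checked numerically on the spectrum alone. Dividing the spectral weight (2ν/κ² + λ)^(-ν) by its value at λ = 0 gives (1 + κ²λ/(2ν))^(-ν), which is what the sketch below evaluates. The function name `normalized_spectrum`, the eigenvalue grid, and κ = 2 are illustrative assumptions.

```python
import numpy as np

def normalized_spectrum(eigvals, nu, kappa):
    """Weights (2*nu/kappa^2 + lambda)^(-nu), divided by their value at lambda = 0."""
    return (1.0 + kappa**2 * eigvals / (2.0 * nu)) ** (-nu)

# Normalized-Laplacian eigenvalues lie in [0, 2]; a small grid stands in for them.
eigvals = np.linspace(0.0, 2.0, 5)
kappa = 2.0

tiny_nu = normalized_spectrum(eigvals, nu=1e-4, kappa=kappa)
huge_nu = normalized_spectrum(eigvals, nu=1e6, kappa=kappa)

# Reference for the large-nu limit: the heat-kernel spectrum exp(-kappa^2 * lambda / 2).
heat = np.exp(-kappa**2 * eigvals / 2.0)
```

A flat spectrum (`tiny_nu` close to 1 everywhere) means every eigenmode gets the same weight, which is white noise. `huge_nu` matching `heat` shows the heat-kernel limit.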

## Usage

To run the scripts:

```
python visualize_gaussian_process_on_cora.py
python compare_covariance_structures.py
```

The results are saved in the `results/` directory:
- `cora_gaussian_process_nu_0.01.png` - Gaussian process with ν = 0.01, nodes colored by class
- `cora_gaussian_process_nu_100.png` - Gaussian process with ν = 100, nodes colored by class
- `cora_gaussian_process_values_nu_0.01.png` - Gaussian process with ν = 0.01, nodes colored by GP values
- `cora_gaussian_process_values_nu_100.png` - Gaussian process with ν = 100, nodes colored by GP values
- `covariance_matrices_comparison.png` - Visualization of covariance matrices
- `covariance_structure_comparison.png` - Visualization of covariance propagation (nodes colored by covariance with the reference node)
- `covariance_structure_labeled_comparison.png` - Visualization of covariance propagation (nodes colored by class)
- `covariance_vs_distance.png` - Plot of covariance vs. graph distance

## Requirements

- PyTorch
- PyTorch Geometric
- NetworkX
- Matplotlib
- NumPy
- SciPy
- Seaborn 