# Installation
## Environment Setup
conda create -n difflig python=3.10
conda activate difflig
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
pip install xarray netCDF4 geopandas shapely matplotlib pandas numpy scikit-learn torch-scatter torch-sparse torch-cluster torch-spline-conv torch-geometric
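
After installation, a quick import check can confirm that the environment is complete. The snippet below is a minimal sketch and not part of the DiffLiG code base: the helper name `missing_modules` is hypothetical, and the module list simply mirrors the packages installed above, using their import names (for example `sklearn` for scikit-learn and `torch_geometric` for torch-geometric).

```python
from importlib.util import find_spec

# Import names corresponding to the conda/pip packages listed above.
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio",
    "xarray", "netCDF4", "geopandas", "shapely", "matplotlib",
    "pandas", "numpy", "sklearn",
    "torch_scatter", "torch_sparse", "torch_cluster",
    "torch_spline_conv", "torch_geometric",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

`find_spec` only locates a module without importing it, so this check does not verify that the installed CUDA build matches your driver.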

# Data Preparation
To reproduce the DiffLiG results, obtain gridded reanalysis data, ground-based station observations, and region shapefiles as described below:

## Gridded Data
ERA5 Gridded Data: Hourly precipitation and other meteorological variables can be downloaded from the Copernicus Climate Data Store (CDS). We recommend using the CDS API to automate bulk downloads. After retrieval, process the data and store it in NetCDF format under the ERA5/Processed directory.

Alternatively, gridded inputs from AI-based models (e.g., FourCastNet, GraphCast) or traditional NWP systems (e.g., IFS-HRES, GFS) can be used, as long as they match the expected format and resolution.
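
A bulk download through the CDS API could look roughly like the sketch below. It is not the script used by the authors. It assumes the `cdsapi` package is installed and CDS credentials are configured in `~/.cdsapirc`. The helper names `build_era5_request` and `download_month`, the dataset name `reanalysis-era5-single-levels`, the variable `total_precipitation`, and the request keys are assumptions based on the request form shown on the CDS dataset page, so check them against that page before use. Requests are split by month to keep each one small.

```python
import calendar


def build_era5_request(year, month, variables=("total_precipitation",)):
    """Build a one-month hourly ERA5 request (keys are assumptions, see above)."""
    n_days = calendar.monthrange(year, month)[1]
    return {
        "product_type": ["reanalysis"],
        "variable": list(variables),
        "year": [str(year)],
        "month": [f"{month:02d}"],
        "day": [f"{d:02d}" for d in range(1, n_days + 1)],
        "time": [f"{h:02d}:00" for h in range(24)],
        "data_format": "netcdf",
        "download_format": "unarchived",
    }


def download_month(year, month, target):
    """Download one month of raw ERA5 data to `target` (a .nc file path)."""
    import cdsapi  # imported lazily; requires credentials in ~/.cdsapirc

    client = cdsapi.Client()
    client.retrieve(
        "reanalysis-era5-single-levels",
        build_era5_request(year, month),
        target,
    )
```

The downloaded files are raw inputs. They still need to be processed before being placed under ERA5/Processed.
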
## Station Data
Station Observation Data: Ground truth measurements are obtained from national meteorological agencies. In our experiments, we use station-level precipitation data provided by the China Meteorological Administration (CMA). Due to data usage restrictions, access may require formal authorization. The processed station files are expected to be placed under station/processed, and associated shapefiles under station/stations.
## Shapefiles
Region Shapefiles: Geographic boundary shapefiles are used for regional filtering and visualization. These can be sourced from GADM or local GIS databases and should be organized under Shapefiles/Regions.

All datasets must be spatially and temporally aligned before training. Preprocessing scripts are provided to assist with format conversion, spatial interpolation, and normalization.
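
To illustrate what spatial alignment between a grid and a station involves, the sketch below interpolates a gridded field to a single station location with bilinear interpolation. It is a generic example, not the interpolation used by the provided preprocessing scripts. The function name `bilinear` is hypothetical, and the code assumes a regular grid whose latitude and longitude axes are sorted in ascending order (raw ERA5 files usually store latitude in descending order, so the axis would need to be flipped first).

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Return the index of the lower grid point and the fractional offset."""
    if not axis[0] <= x <= axis[-1]:
        raise ValueError(f"{x} lies outside the grid range")
    i = min(bisect_right(axis, x) - 1, len(axis) - 2)
    t = (x - axis[i]) / (axis[i + 1] - axis[i])
    return i, t


def bilinear(lats, lons, values, lat, lon):
    """Interpolate values[i][j] (indexed by lat, lon) to the point (lat, lon)."""
    i, ty = _bracket(lats, lat)
    j, tx = _bracket(lons, lon)
    # Interpolate along longitude on the two bracketing latitude rows.
    south = values[i][j] * (1 - tx) + values[i][j + 1] * tx
    north = values[i + 1][j] * (1 - tx) + values[i + 1][j + 1] * tx
    # Then interpolate along latitude between the two rows.
    return south * (1 - ty) + north * ty


if __name__ == "__main__":
    lats = [30.0, 30.25]
    lons = [110.0, 110.25]
    field = [[0.0, 1.0], [2.0, 3.0]]
    print(bilinear(lats, lons, field, 30.125, 110.125))
```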

# Data Structure
DiffLiG expects the following data structure:
RootDataPath/
├── ERA5
│   └── Processed
│       ├── era5_{year}.nc
│       └── ...
├── Shapefiles
│   └── Regions
│       ├── region_name.cpg
│       ├── region_name.dbf
│       ├── region_name.prj
│       ├── region_name.shp
│       └── region_name.shx
└── station
    ├── processed
    │   └── Meta--{year_start}--{year_end}
    │       ├── station_{year}.nc4
    │       └── ...
    └── stations
        ├── stations_{year_range}.cpg
        ├── stations_{year_range}.dbf
        ├── stations_{year_range}.prj
        ├── stations_{year_range}.shp
        └── stations_{year_range}.shx
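
Before launching a run, it can help to verify that the data root matches this layout. The following is a minimal sketch that uses only the standard library. The function name `check_layout` is hypothetical and not part of the DiffLiG code base, and the file patterns are taken directly from the tree above.

```python
from pathlib import Path

# Component files listed for each shapefile in the tree above.
SHAPEFILE_PARTS = (".cpg", ".dbf", ".prj", ".shp", ".shx")


def check_layout(root):
    """Return a list of problems found under the data root (empty if none)."""
    root = Path(root)
    problems = []

    def require_files(directory, pattern):
        if not directory.is_dir():
            problems.append(f"missing directory: {directory}")
            return []
        matches = sorted(directory.glob(pattern))
        if not matches:
            problems.append(f"no files matching {pattern} in {directory}")
        return matches

    require_files(root / "ERA5" / "Processed", "era5_*.nc")

    shapefiles = require_files(root / "Shapefiles" / "Regions", "*.shp")
    shapefiles += require_files(root / "station" / "stations", "stations_*.shp")
    for shp in shapefiles:
        for suffix in SHAPEFILE_PARTS:
            part = shp.with_suffix(suffix)
            if not part.exists():
                problems.append(f"missing shapefile component: {part}")

    processed = root / "station" / "processed"
    if not processed.is_dir():
        problems.append(f"missing directory: {processed}")
    else:
        meta_dirs = [d for d in sorted(processed.glob("Meta--*")) if d.is_dir()]
        if not meta_dirs:
            problems.append(f"no Meta--* directory in {processed}")
        for meta_dir in meta_dirs:
            require_files(meta_dir, "station_*.nc4")

    return problems


if __name__ == "__main__":
    import sys

    for problem in check_layout(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(problem)
```

An empty result means that every directory and file pattern from the tree was found. The check does not open the files or validate their contents.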

# Code Organization
Source/
├── DiffLiG_Launcher.py    # Main entry point for running the model
├── GNN_Main.py            # Contains the main training and evaluation loop
├── Dataloader/
│   ├── ERA5.py            # Loader for ERA5 gridded data
│   ├── station.py         # Loader for station observation data
│   ├── Metadata.py        # Manages and filters weather station metadata
│   └── MixData.py         # Combines ERA5 and station data
├── Network/
│   ├── stationNetwork.py  # Handles station connectivity
│   └── ERA5Network.py     # Processes ERA5 grid connectivity
├── Modules/               # Contains building blocks for the model
├── EvaluateModel.py       # Functions for model evaluation and metrics
└── Normalization/         # Contains data normalization utilities

# Usage
python Source/DiffLiG_Launcher.py
