# LD-SDM: Language-Driven Hierarchical Species Distribution Modeling

## 👨‍💻 Getting Started 

#### Installing Required Packages
There are two ways to set up an environment that can run everything in the repository:
1. Use the Dockerfile provided in the repository to build a Docker image with all required packages:
    ```
    docker build -t <your-docker-hub-id>/ldsdm .
    ```
2. Create a conda environment with all required packages:
    ```
    conda create -n ldsdm python=3.10 && \
    conda activate ldsdm && \
    pip install -r requirements.txt
    ```

## 🔥 Training Models
1. Set all parameters of interest in `config.py` before launching the training script.
2. Run training by calling:
    ```
    python train.py --expt_name=<your-expt-name>
    ``` 
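
The steps in this README refer to settings in `config.py`, such as `cfg.best_model` and the path to the testing parquet file. The sketch below only illustrates the attribute-style access that `cfg.best_model` suggests. It is not the repository's actual configuration: `best_model` is the only field named in this README, `test_path` is a hypothetical name, and both values are placeholders to fill in.

```python
from types import SimpleNamespace

# Hypothetical layout: only `best_model` is named in this README.
cfg = SimpleNamespace(
    best_model="<path-to-model>",        # checkpoint used when generating predictions
    test_path="<path-to-test-parquet>",  # hypothetical field: parquet file used for evaluation
)

# Fields are read with attribute access, matching the `cfg.best_model` notation.
print(cfg.best_model)
```

Check the repository's `config.py` for the real field names before editing anything.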

## 🦩 Generating Predictions
To generate a model's predictions as a world map:

1. Set the parameters for the best model in `config.py` (`cfg.best_model`).
2. Then run the following command:

    ```
    python predict_range.py --model_path=<path-to-model> --label=<species_name> --threshold=<binary-threshold>
    ```
3. To generate predictions at higher taxonomic levels, add the `--taxonomy_level` argument to the command above.
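
As a rough illustration of what a binary threshold does to a predicted range, the sketch below converts a small grid of presence probabilities into a presence/absence mask. It is a minimal standalone example, not the code in `predict_range.py`: the function name `binarize_range`, the toy grid, and the choice of `>=` as the comparison are all assumptions.

```python
from typing import List


def binarize_range(probs: List[List[float]], threshold: float) -> List[List[int]]:
    """Convert a grid of presence probabilities into a 0/1 range mask."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    # Assumption: a cell counts as "present" when its probability is >= threshold.
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


# Toy 2x3 grid of hypothetical presence probabilities.
grid = [
    [0.05, 0.40, 0.90],
    [0.55, 0.10, 0.75],
]
mask = binarize_range(grid, threshold=0.5)
print(mask)  # [[0, 0, 1], [1, 0, 1]]
```

A lower threshold yields a larger predicted range and a higher threshold a more conservative one, so it is worth trying a few values for each species.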

## 🏃🏻‍♂️ Evaluation
We evaluate on three tasks, using the following procedures:

1. In-domain Species Range Prediction: Run the script `eval.py` after specifying the path to the testing parquet file in `config.py`.
2. Zero-shot Species Range Prediction: Follow the same procedure as in 1, but specify the parquet file containing unseen species.
3. Geo-Feature Regression: Run the script `utils/viz_pos_embeds.py` to export the location embeddings learned by the model, then use the evaluation script in [SINR](https://github.com/elijahcole/sinr).