# star-embed

This repository contains the code to reproduce the experiments in the paper.

## Requirement Installation

1. Install conda following this documentation: https://docs.conda.io/projects/conda/en/latest/user-guide/install/index.html

2. Install requirements:
```bash
conda create -n starembed python=3.11
conda activate starembed
pip install -r requirements.txt
cd moirai
pip install -e .
cd ..
```

## Datasets

Raw datasets:
- ZTF
- CSDR1

The script that cross-matches the ZTF and CSDR1 stars is `scripts/query_csdr1_from_ztf.py`.

The scripts that generate embeddings for each star in the train-val-test splits, one per pretrained model, are listed below. You do not need to rerun them to reproduce the experiment results: the generated embeddings are available at https://huggingface.co/datasets/123anonymous123/StarEmbed and can be downloaded and loaded directly by the clustering and classification code.
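
A minimal sketch of downloading the precomputed embeddings, assuming the `huggingface_hub` package is installed (it may or may not be listed in `requirements.txt`). The function name `download_embeddings` and the target directory `data/StarEmbed` are illustrative choices, not names used by this repository.

```python
from pathlib import Path

# Dataset repository named in this README.
REPO_ID = "123anonymous123/StarEmbed"


def download_embeddings(local_dir="data/StarEmbed"):
    """Download the precomputed embeddings and return the local directory.

    `local_dir` is a hypothetical location; point it wherever the
    benchmark code expects to find the embeddings.
    """
    # Imported lazily so this module loads even without huggingface_hub.
    from huggingface_hub import snapshot_download

    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    # repo_type="dataset" is required because the files live in a dataset repo.
    snapshot_download(repo_id=REPO_ID, repo_type="dataset", local_dir=str(target))
    return target
```

Calling `download_embeddings()` once before running the benchmark scripts should place the files under the chosen directory; the layout inside that directory is whatever the dataset repository provides.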

For MOIRAI:

```
moirai/exps/main_moirai_embs_for_train_test_split.py
```

For Chronos and Chronos-Bolt:
```
scripts/chronos_embs_on_raw_train_test_split.py
```

For ASTROMER:
```
scripts/astromer_embs_on_raw_train_test_split.py
```

For Random Embeddings:
```
scripts/random_embs_on_raw_train_test_split.py
```

For Hand-crafted Features:
```
scripts/hand_crafted_feats_on_raw_train_test_split_step1.py
scripts/hand_crafted_feats_on_raw_train_test_split_step2.py
```



The two hand-crafted feature scripts above depend on the FATS library. The FATS package on GitHub is old and still targets Python 2, but a pending pull request migrates most of it to Python 3. Install FATS from that pull request before running the scripts:

```
git clone https://github.com/isadoranun/FATS
cd FATS
# Check out pull request #13 (requires the GitHub CLI, installed and logged in)
gh pr checkout 13
pip install -e .
# Then add the FATS directory to the Python path with sys.path.append in the script
```
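
The `sys.path.append` step can be sketched as follows. The location `FATS_DIR` is a placeholder that assumes FATS was cloned into the current working directory, and the helper `add_to_sys_path` is illustrative rather than part of this repository.

```python
import sys
from pathlib import Path

# Hypothetical location of the FATS clone; adjust to where you cloned it.
FATS_DIR = Path("FATS")


def add_to_sys_path(directory):
    """Append `directory` to sys.path once and return its absolute path."""
    path = str(Path(directory).expanduser().resolve())
    if path not in sys.path:
        sys.path.append(path)
    return path


add_to_sys_path(FATS_DIR)
# After this call, `import FATS` should resolve, assuming the clone is at FATS_DIR.
```

Placing this near the top of the hand-crafted feature scripts, before FATS is imported, keeps the path change local to those scripts.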




## Benchmark Experiments

All bash scripts are located in `starembed_submission/src/starembed/eval/src/bash_script`, and all Python files are located in `starembed_submission/src/starembed/eval/src/benchmark`.


### Clustering
To reproduce the clustering results, submit the SLURM job script:
```
sbatch src/starembed/eval/src/bash_script/clustering/slurm_clustering.sh
```


### Classification

For example, to run the linear classifier:
```
sbatch src/starembed/eval/src/bash_script/classification/slurm_linear.sh
```
The scripts for the other classifiers are in the same directory, `src/starembed/eval/src/bash_script/classification/`.
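
To submit every classifier job in one go, a small helper along these lines may be convenient. It is a sketch under assumptions: the `slurm_*.sh` pattern is inferred from `slurm_linear.sh` and may not match every script in the directory, and `submit_all` is a hypothetical helper, not part of this repository.

```python
import subprocess
from pathlib import Path

# Directory named in this README; run from the repository root.
SCRIPT_DIR = Path("src/starembed/eval/src/bash_script/classification")


def find_job_scripts(script_dir=SCRIPT_DIR, pattern="slurm_*.sh"):
    """Return the job scripts matching `pattern`, sorted by name."""
    return sorted(Path(script_dir).glob(pattern))


def submit_all(script_dir=SCRIPT_DIR, pattern="slurm_*.sh", dry_run=True):
    """Build one `sbatch` command per script; run them unless dry_run is set."""
    commands = [["sbatch", str(p)] for p in find_job_scripts(script_dir, pattern)]
    for cmd in commands:
        if dry_run:
            # Only show what would be submitted.
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands
```

With the default `dry_run=True`, `submit_all()` only prints the commands, which makes it easy to check the list before passing `dry_run=False` on a machine where `sbatch` is available.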

### OOD

For the out-of-distribution (OOD) experiments, see the code in `starembed_submission/src/starembed/eval/src/benchmark/ood`.
