# Official code for NovoBench


# Installation
```bash
conda create -n novobench python=3.10
conda activate novobench
pip install torch  # Please install the correct version of torch according to your CUDA version
pip install scipy transformers tqdm scikit-learn wandb torchmetrics biopython
```
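
To confirm that the environment is complete before running anything, a quick check along the following lines may help. This is a minimal sketch and not part of the NovoBench repository: the helper `missing_packages` and the `REQUIRED` mapping are hypothetical names introduced here, and the check only verifies that each package can be located, not that its version is compatible.

```python
import importlib.util

# Maps pip package name -> import module name.
# scikit-learn is imported as `sklearn` and biopython as `Bio`.
REQUIRED = {
    "torch": "torch",
    "scipy": "scipy",
    "transformers": "transformers",
    "tqdm": "tqdm",
    "scikit-learn": "sklearn",
    "wandb": "wandb",
    "torchmetrics": "torchmetrics",
    "biopython": "Bio",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        import torch

        # Reports whether the installed torch build can see a CUDA device.
        print("All packages found. CUDA available:", torch.cuda.is_available())
```

If the script reports that CUDA is unavailable on a GPU machine, the installed torch build probably does not match the local CUDA version.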

# Usage

## Prepare the embeddings

Set the dataset folder path and the output folder for the embeddings in the config file passed via `--cfg_path`, then run the extraction script.

```bash
# Other biological language models (BLMs) may need extra dependencies; install them before extracting
python utils/get_embed/extract.py --cfg_path config/embed/cfg_embed_ESM2.py
```

## Evaluate BLMs

Set the dataset folder path and the embedding folder path in the config file passed via `--cfg_path`.

Evaluate BLMs on NovoBench:
```bash
python run.py --cfg_path config/diff_llms.py
```

Fine-tune the BLMs:
```bash
python run.py --cfg_path config/finetune.py
```