# AFD-INSTRUCTION: A Comprehensive Antibody Instruction Dataset with Functional Annotations for LLM-Based Understanding and Design

This repository contains the implementation code for AFD-INSTRUCTION, an antibody instruction dataset with functional annotations for LLM-based understanding and design.

## Installation

### Prerequisites

- Python 3.8+
- CUDA-compatible GPU (recommended)
- Conda package manager

### Environment Setup

1. **Create conda environment from environment.yaml:**
```bash
conda env create -f environment.yaml
conda activate afd
```

2. **Install additional dependencies not in environment.yaml:**
```bash
# Install Unsloth for efficient training
pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"

# Install API dependencies
pip install openai python-dotenv

# Optional: Install PyRosetta for structural analysis
pip install pyrosetta-installer
python -c "import pyrosetta_installer; pyrosetta_installer.install_pyrosetta()"
```

### Verify Installation

```bash
python -c "import torch; print(f'PyTorch version: {torch.__version__}')"
python -c "import transformers; print(f'Transformers version: {transformers.__version__}')"
python -c "from unsloth import FastLanguageModel; print('Unsloth installed successfully')"
```
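
The three one-liners above can also be combined into a single check that reports every package at once instead of stopping at the first failure. The following is a minimal sketch, not part of the repository: the helper name `check_packages` is hypothetical, and the package list simply mirrors the commands above.

```python
import importlib
import importlib.util


def check_packages(names):
    """Return {name: version string, or None if missing or not importable}."""
    report = {}
    for name in names:
        # find_spec returns None when a top-level package is not installed.
        if importlib.util.find_spec(name) is None:
            report[name] = None
            continue
        try:
            module = importlib.import_module(name)
        except Exception:
            # Some packages raise at import time (e.g. when no GPU is present).
            report[name] = None
            continue
        report[name] = getattr(module, "__version__", "unknown")
    return report


if __name__ == "__main__":
    report = check_packages(["torch", "transformers", "unsloth"])
    for name, version in report.items():
        print(f"{name}: {version if version else 'NOT AVAILABLE'}")
    if report.get("torch"):
        import torch

        # A GPU is recommended but not required, so this is informational only.
        print(f"CUDA available: {torch.cuda.is_available()}")
```

A package reported as `NOT AVAILABLE` is either not installed or failed during import; running the corresponding one-liner above shows the full error.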

## Project Structure

```
├── data/             # Training and evaluation data
├── finetune/         # Model fine-tuning code
├── benchmark/        # Evaluation benchmarks
├── eval/             # Evaluation metrics
├── scripts/          # Utility scripts
├── environment.yaml  # Conda environment file
└── README.md         # This file
```
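
To confirm that a checkout matches the layout above, a short script can list any missing directories. This is a sketch under assumptions: the helper `missing_dirs` is hypothetical, and the directory names are taken directly from the tree above.

```python
from pathlib import Path

# Top-level directories listed in the project structure above.
EXPECTED_DIRS = ["data", "finetune", "benchmark", "eval", "scripts"]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected directory names that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

Run it from the repository root; an empty result means all five directories were found.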
