# Condensed: Fine-tuning

Summary: This tutorial provides implementation guidance for fine-tuning BGE (BAAI General Embedding) models using the FlagEmbedding library. It covers essential technical details for model configuration, data processing, and training parameters, specifically focusing on embedding model fine-tuning tasks. Key functionalities include setting up model checkpoints, handling data formats (query-positive-negative triplets), managing sequence lengths, and configuring training parameters like learning rate and batch size. The tutorial is particularly useful for tasks involving custom embedding model training, with specific features including knowledge distillation, gradient checkpointing, FP16 training, and distributed training optimization using deepspeed.

*This is a condensed version that preserves essential implementation details and context.*

Here's the condensed tutorial focusing on essential implementation details:

# Fine-tuning BGE Models

## Critical Setup
```bash
pip install -U FlagEmbedding[finetune]
```

## Key Implementation Parameters

### Core Model Configuration
- `model_name_or_path`: Initial model checkpoint
- `cache_dir`: Storage location for pre-trained models
- `trust_remote_code`: Enable remote code execution

### Data Processing Parameters
- `train_data`: Training data paths (Required format: `query: str`, `pos: List[str]`, `neg: List[str]`)
- `query_max_len`: Max query sequence length (post-tokenization)
- `passage_max_len`: Max passage sequence length (post-tokenization)
- `train_group_size`: Batch grouping size
- `knowledge_distillation`: Enable KD when `pos_scores` and `neg_scores` are available

### Critical Training Parameters
```python
# Essential configurations
query_instruction_for_retrieval = 'Represent this sentence for searching relevant passages: '
temperature = 0.02
sentence_pooling_method = 'cls'  # Options: cls, mean, last_token
learning_rate = 1e-5
num_train_epochs = 2
per_device_train_batch_size = 2
```

## Example Implementation
```bash
torchrun --nproc_per_node 2 \
    -m FlagEmbedding.finetune.embedder.encoder_only.base \
    --model_name_or_path BAAI/bge-large-en-v1.5 \
    --train_data ./ft_data/training.json \
    --query_max_len 512 \
    --passage_max_len 512 \
    --learning_rate 1e-5 \
    --num_train_epochs 2 \
    --per_device_train_batch_size 2 \
    --temperature 0.02 \
    --normalize_embeddings True \
    --gradient_checkpointing \
    --fp16
```

## Best Practices
1. Use gradient checkpointing for memory efficiency
2. Enable FP16 for faster training
3. Set appropriate warmup_ratio (recommended: 0.1)
4. Normalize embeddings for better performance
5. Use deepspeed for distributed training optimization

## Important Notes
- Ensure training data follows required format
- Monitor logging_steps and save_steps for training progress
- Consider using negatives_cross_device for better negative sampling
- Adjust batch size based on available GPU memory