# Condensed: BGE Reranker

Summary: This tutorial provides implementation guidance for BGE Rerankers, focusing on model initialization and scoring computation across different versions. It covers techniques for working with various reranker models (base, large, v2 series) using the FlagEmbedding library, including specialized implementations like LayerWise and LightWeight rerankers. The tutorial helps with tasks involving text pair scoring and reranking, particularly for search and retrieval applications. Key features include multilingual support, performance optimization techniques (FP16, layer selection, compression), and model selection guidelines based on specific use cases, with code examples demonstrating different initialization patterns and scoring methods for each model variant.

*This is a condensed version that preserves essential implementation details and context.*

Here's the condensed tutorial focusing on essential implementation details:

# BGE Reranker Implementation Guide

## Installation
```python
pip install -U FlagEmbedding
```

## Key Models and Implementations

### 1. First Generation BGE Rerankers
- **Models Available**: 
  - `bge-reranker-base` (278M params, Chinese/English)
  - `bge-reranker-large` (560M params, Chinese/English)

**Basic Implementation:**
```python
from FlagEmbedding import FlagReranker

model = FlagReranker(
    'BAAI/bge-reranker-large',
    use_fp16=True,  # Performance optimization
    devices=["cuda:0"]  # Use "cpu" if no GPU
)

# Example usage
scores = model.compute_score([
    ["query1", "passage1"],
    ["query2", "passage2"]
])
```

### 2. BGE Reranker v2 Series

#### a. bge-reranker-v2-m3 (568M, Multilingual)
```python
from FlagEmbedding import FlagReranker

reranker = FlagReranker('BAAI/bge-reranker-v2-m3', devices=["cuda:0"], use_fp16=True)
score = reranker.compute_score(['query', 'passage'], normalize=True)  # normalize for 0-1 range
```

#### b. bge-reranker-v2-gemma (2.51B, Multilingual)
```python
from FlagEmbedding import FlagLLMReranker

reranker = FlagLLMReranker('BAAI/bge-reranker-v2-gemma', devices=["cuda:0"], use_fp16=True)
```

#### c. bge-reranker-v2-minicpm-layerwise (2.72B, Multilingual)
```python
from FlagEmbedding import LayerWiseFlagLLMReranker

reranker = LayerWiseFlagLLMReranker('BAAI/bge-reranker-v2-minicpm-layerwise', 
                                   devices=["cuda:0"], 
                                   use_fp16=True)
score = reranker.compute_score(['query', 'passage'], cutoff_layers=[28])
```

#### d. bge-reranker-v2.5-gemma2-lightweight (9.24B, Multilingual)
```python
from FlagEmbedding import LightWeightFlagLLMReranker

reranker = LightWeightFlagLLMReranker('BAAI/bge-reranker-v2.5-gemma2-lightweight', 
                                     devices=["cuda:0"], 
                                     use_fp16=True)
score = reranker.compute_score(['query', 'passage'], 
                             cutoff_layers=[28], 
                             compress_ratio=2, 
                             compress_layers=[24, 40])
```

## Best Practices

1. **Model Selection Guidelines**:
   - Multilingual: Use v2-m3, v2-gemma, or v2.5-gemma2-lightweight
   - Chinese/English: Use v2-m3 or v2-minicpm-layerwise
   - Efficiency-focused: Use v2-m3 or lower layers of v2-minicpm-layerwise
   - Resource-constrained: Use base or large models
   - Best performance: Use v2-minicpm-layerwise or v2-gemma

2. **Performance Optimization**:
   - Enable `use_fp16=True` when using GPU
   - Utilize layer selection in supported models
   - Use compression options in lightweight models

3. **Important Note**: Always test models on your specific use case to find the optimal speed-quality balance.