# Condensed: BGE Series

Summary: This tutorial provides implementation guidance for the BGE (BAAI General Embedding) series of embedding models, specifically helping with text embedding and retrieval tasks. It covers code examples for different BGE variants including base models, BGE v1.5, BGE M3 (multi-functional/lingual/granularity), BGE Multilingual Gemma2, and BGE ICL (in-context learning). Key functionalities include dense/sparse retrieval, multilingual support (100+ languages), long text processing (up to 8192 tokens), and few-shot learning capabilities. The tutorial demonstrates essential implementation patterns using the FlagEmbedding library, including model initialization, query/corpus encoding, and parameter configuration for optimization.

*This is a condensed version that preserves essential implementation details and context.*

Here's the condensed tutorial focusing on essential implementation details and key concepts:

# BGE Series Tutorial

## Key Models & Versions

### 1. BGE Base Models
- Available in large/base/small sizes for English and Chinese
- Uses BERT architecture
- Sizes range from 24M-500M parameters

### 2. BGE v1.5 
- Improved similarity distribution
- Enhanced retrieval without instruction
- Same usage pattern as v1

### 3. BGE M3
- Multi-Functional: Dense retrieval, multi-vector retrieval, sparse retrieval
- Multi-Lingual: 100+ languages
- Multi-Granularity: Up to 8192 tokens

### 4. BGE Multilingual Gemma2
- LLM-based multilingual embedding model
- 9.24B parameters

### 5. BGE ICL
- In-context learning capabilities
- Based on Mistral-7B
- 7.11B parameters

## Implementation Examples

### Basic Usage (BGE v1/v1.5)
```python
from FlagEmbedding import FlagModel

model = FlagModel(
    'BAAI/bge-base-en',
    query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
    query_instruction_format='{}{}'
)

# Encode queries and corpus
q_embeddings = model.encode_queries(queries)
p_embeddings = model.encode_corpus(corpus)
scores = q_embeddings @ p_embeddings.T
```

### BGE M3 Usage
```python
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel('BAAI/bge-m3', use_fp16=True)

embeddings = model.encode(
    sentences, 
    max_length=8192,
    return_dense=True, 
    return_sparse=True, 
    return_colbert_vecs=True
)
```

### BGE ICL Usage
```python
from FlagEmbedding import FlagICLModel

model = FlagICLModel('BAAI/bge-en-icl', 
                     examples_for_task=examples)  # Optional few-shot examples

embeddings_1 = model.encode_queries(queries)
embeddings_2 = model.encode_corpus(documents)
```

## Important Parameters

- `max_length`: Controls input token length (default varies by model)
- `batch_size`: For batch processing (default=256)
- `use_fp16`: Enable for faster computation with slight performance trade-off
- `query_instruction_for_retrieval`: Prefix for query encoding
- `convert_to_numpy`: Convert outputs to numpy arrays (default=True)

## Best Practices

1. Use appropriate model size based on requirements:
   - Small for speed/resource constraints
   - Large for maximum performance

2. Consider BGE M3 for:
   - Multilingual applications
   - Long document processing
   - Multiple retrieval methods

3. Use BGE ICL when:
   - Task-specific adaptation is needed
   - Few-shot examples are available

4. Enable `use_fp16=True` for faster inference when slight performance degradation is acceptable