# Condensed: BGE-EN-ICL

Summary: This tutorial covers BGE-EN-ICL, a Mistral-7B-based embedding model that supports both zero-shot and few-shot (in-context) use. It shows how to load the model, pool the final [EOS] token's hidden state into a sequence embedding, and run in-context learning through the FlagEmbedding library. Topics include embedding queries and documents, handling attention masks under left or right padding, and formatting few-shot examples.

*This is a condensed version that preserves essential implementation details and context.*

# BGE-EN-ICL Tutorial

## Installation
```bash
pip install -U transformers FlagEmbedding
```

## Key Components & Implementation

### 1. Model Structure
- Built on Mistral-7B, a decoder-only LLM
- Differs from previous BGE models, which use encoder-only architectures

```python
from transformers import AutoTokenizer, AutoModel
import torch

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-en-icl")
raw_model = AutoModel.from_pretrained("BAAI/bge-en-icl")
```

### 2. Pooling Method
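- Uses the output embedding of the final [EOS] token as the sequence embedding
- Keeps the LLM's unidirectional (causal) attention instead of switching to bidirectional attention for embedding training, which avoids a mismatch with how the model was pretrained; under causal attention only the last token can attend to the whole input, hence [EOS] pooling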

Key Implementation:
```python
def last_token_pool(last_hidden_states: torch.Tensor,
                    attention_mask: torch.Tensor) -> torch.Tensor:
    # Left padding: every row's final position holds a real token
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return last_hidden_states[:, -1]
    else:
        # Right padding: index of each row's last real token
        sequence_lengths = attention_mask.sum(dim=1) - 1
        batch_size = last_hidden_states.shape[0]
        # Select one hidden state per row at that index
        return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]
```
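To make the index arithmetic concrete, the following dependency-free sketch mirrors the same selection rule on plain Python lists instead of tensors. The helper name `last_token_index` is hypothetical and not part of the tutorial or any library, and it decides per row rather than per batch as `last_token_pool` does; it only shows which position would be read for a right-padded and a left-padded row.

```python
def last_token_index(mask_row: list) -> int:
    """Return the position of the last real token in one attention-mask row.

    Hypothetical helper for illustration only: a per-row simplification of
    the logic in last_token_pool, applied to a plain list of 0/1 values.
    """
    if mask_row[-1] == 1:
        # Left padding (or no padding): the final position is a real token.
        return len(mask_row) - 1
    # Right padding: real tokens occupy positions 0 .. sum(mask) - 1.
    return sum(mask_row) - 1


right_padded = [1, 1, 1, 0, 0]  # three real tokens, then two pads
left_padded = [0, 0, 1, 1, 1]   # two pads, then three real tokens

print(last_token_index(right_padded))  # 2
print(last_token_index(left_padded))   # 4
```

In both cases the selected position is the last non-padding token, which for this model is where the [EOS] token sits.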

### 3. In-Context Learning Usage
```python
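from FlagEmbedding import FlagICLModel

# Illustrative placeholder data; dict keys assumed to be instruct/query/response
examples = [{
    'instruct': 'Given a web search query, retrieve relevant passages that answer the query.',
    'query': 'what is a virtual interface',
    'response': 'A virtual interface is a software abstraction of a physical network interface.',
}]
queries = ['how much protein should a female eat', 'summit define']
documents = ['A general guideline for adult women is about 46 grams of protein per day.',
             'A summit is the highest point of a mountain.']

model = FlagICLModel('BAAI/bge-en-icl', 
                     examples_for_task=examples,  # Optional: None for zero-shot
                     examples_instruction_format="<instruct>{}\n<query>{}\n<response>{}", 
                     devices=[0])

# Generate embeddings
query_embeddings = model.encode_queries(queries)
doc_embeddings = model.encode_corpus(documents)
similarity = query_embeddings @ doc_embeddings.T  # shape: (num_queries, num_documents)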
```
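The final line yields a matrix with one row per query and one column per document. As a rough illustration of how to read it, here is a pure-Python sketch on made-up 2-D vectors. The vectors and helper names are invented for this example, and it assumes the embeddings are L2-normalized so that the dot product equals cosine similarity.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def score_matrix(query_vecs, doc_vecs):
    """Dot product of every query with every document (rows = queries)."""
    return [[sum(q_i * d_i for q_i, d_i in zip(q, d)) for d in doc_vecs]
            for q in query_vecs]


# Toy stand-ins for real embeddings (assumption: purely illustrative values)
query_vecs = [l2_normalize(v) for v in [[1.0, 0.0], [0.0, 2.0]]]
doc_vecs = [l2_normalize(v) for v in [[0.0, 3.0], [4.0, 0.0], [1.0, 1.0]]]

scores = score_matrix(query_vecs, doc_vecs)
# Index of the highest-scoring document for each query
best_doc = [max(range(len(row)), key=row.__getitem__) for row in scores]
print(best_doc)  # [1, 0]
```

With real embeddings the same reading applies: `similarity[i][j]` scores document `j` against query `i`, and sorting a row ranks the documents for that query.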

## Important Notes
- Supports both zero-shot and few-shot embedding
- Zero-shot: pass `examples_for_task=None` and use the model the same way as BGE v1 and v1.5
- Few-shot: pass examples via `examples_for_task`, formatted according to `examples_instruction_format`
- Each example should include an instruction, a query, and a response
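The difference between the two modes comes down to the text that gets embedded. The sketch below assembles that text with plain string formatting. `build_prompt` is a hypothetical helper, not a FlagEmbedding function. The example format is the one passed to `FlagICLModel` above, while the query format, the `\n<response>` suffix, the blank-line separator, and the `instruct`/`query`/`response` keys are assumptions taken from the model card's conventions; the library may differ in details such as truncation.

```python
# Example format as passed to FlagICLModel; the other two strings are assumed.
EXAMPLE_FORMAT = "<instruct>{}\n<query>{}\n<response>{}"
QUERY_FORMAT = "<instruct>{}\n<query>{}"
RESPONSE_SUFFIX = "\n<response>"


def build_prompt(task: str, query: str, examples=None) -> str:
    """Hypothetical helper: assemble the text embedded for one query."""
    parts = [
        EXAMPLE_FORMAT.format(ex["instruct"], ex["query"], ex["response"])
        for ex in (examples or [])
    ]
    # The query itself ends with an empty response slot.
    parts.append(QUERY_FORMAT.format(task, query) + RESPONSE_SUFFIX)
    return "\n\n".join(parts)


task = "Given a web search query, retrieve relevant passages that answer the query."
demo_examples = [
    {"instruct": task,
     "query": "what is a virtual interface",
     "response": "A virtual interface is a software abstraction of a network interface."},
]

zero_shot = build_prompt(task, "summit define")
few_shot = build_prompt(task, "summit define", demo_examples)
print(few_shot)
```

In this sketch the zero-shot prompt is simply the few-shot prompt with the example prefix removed, which is why the same model can serve both modes.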

## Best Practices
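- Keep `examples_instruction_format` consistent with the structure of the examples you pass for few-shot use
- The model is built on a 7B-parameter LLM, so choose `devices` with enough GPU memory
- Evaluate on your own task in both zero-shot and few-shot modes to check whether the examples actually help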