Summary: This tutorial demonstrates the implementation of BGE Auto Embedder, focusing on text embedding generation for information retrieval tasks. It provides code examples for initializing and using the FlagAutoModel to encode queries and passages, with specific implementation details for model configuration through EmbedderConfig dataclass. The tutorial covers essential techniques for customizing embedding models, including pooling methods, device allocation, and query instruction formatting. It helps with tasks involving text embedding generation, similarity scoring, and model customization, supporting multiple embedding model families (BGE, E5, GTE, SFR). Key features include flexible model initialization, query/corpus encoding, and configuration management through a standardized interface.

# BGE Auto Embedder

FlagEmbedding provides a high-level class, `FlagAutoModel`, that unifies inference across embedding models. Besides the BGE series, it also supports other popular open-source embedding models such as E5, GTE, and SFR. This tutorial shows how to use it.


```python
%pip install FlagEmbedding
```

## 1. Usage

First, import `FlagAutoModel` from FlagEmbedding, and use the `from_finetuned()` function to initialize the model:


```python
from FlagEmbedding import FlagAutoModel

model = FlagAutoModel.from_finetuned(
    'BAAI/bge-base-en-v1.5',
    query_instruction_for_retrieval="Represent this sentence for searching relevant passages: ",
    devices="cuda:0",   # if not specified, will use all available gpus or cpu when no gpu available
)
```

Then use the model exactly the same way as `FlagModel` (or `FlagM3Model` for BGE M3, `FlagLLMModel` for BGE Multilingual Gemma2, `FlagICLModel` for BGE ICL):


```python
queries = ["query 1", "query 2"]
corpus = ["passage 1", "passage 2"]

# encode the queries and corpus
q_embeddings = model.encode_queries(queries)
p_embeddings = model.encode_corpus(corpus)

# compute the similarity scores
scores = q_embeddings @ p_embeddings.T
print(scores)
```
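To turn the score matrix into a ranking, sort each row by descending score. The following is a minimal sketch in plain Python, using made-up 2-dimensional unit vectors in place of real embeddings. `score_matrix` and `rank_passages` are hypothetical helper names, not part of FlagEmbedding.

```python
def score_matrix(q_vecs, p_vecs):
    """Inner product of every query vector with every passage vector."""
    return [[sum(a * b for a, b in zip(q, p)) for p in p_vecs] for q in q_vecs]


def rank_passages(scores, top_k=1):
    """For each query row, return passage indices sorted by descending score."""
    ranked = []
    for row in scores:
        order = sorted(range(len(row)), key=lambda i: row[i], reverse=True)
        ranked.append(order[:top_k])
    return ranked


# Toy unit-length vectors standing in for real embeddings (illustrative values only).
toy_queries = [[1.0, 0.0], [0.0, 1.0]]
toy_corpus = [[0.6, 0.8], [0.8, 0.6]]

toy_scores = score_matrix(toy_queries, toy_corpus)
best = rank_passages(toy_scores, top_k=1)
print(toy_scores)
print(best)
```

Assuming the real `scores` is a NumPy array, `rank_passages(scores.tolist())` would apply the same ranking to the output of the cell above.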

## 2. Explanation

`FlagAutoModel` uses an OrderedDict, `AUTO_EMBEDDER_MAPPING`, to store the configuration of every supported model:


```python
from FlagEmbedding.inference.embedder.model_mapping import AUTO_EMBEDDER_MAPPING

list(AUTO_EMBEDDER_MAPPING.keys())
```


```python
print(AUTO_EMBEDDER_MAPPING['bge-en-icl'])
```

Each value in the mapping is an `EmbedderConfig` object, which has four attributes:

```python
@dataclass
class EmbedderConfig:
    model_class: Type[AbsEmbedder]
    pooling_method: PoolingMethod
    trust_remote_code: bool = False
    query_instruction_format: str = "{}{}"
```
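The `query_instruction_format` field is a format string with two placeholders. A plausible reading, assumed here rather than confirmed by this tutorial, is that the first slot receives the instruction and the second receives the query. The helper `apply_instruction` below is hypothetical and only illustrates how such a format string would be filled.

```python
def apply_instruction(instruction_format: str, instruction: str, query: str) -> str:
    """Fill the two placeholders: instruction first, then the query (assumed order)."""
    return instruction_format.format(instruction, query)


instruction = "Represent this sentence for searching relevant passages: "

# Default format "{}{}": the instruction is simply prepended to the query.
print(apply_instruction("{}{}", instruction, "query 1"))

# A made-up template-style format, only to show how a different layout would be filled.
print(apply_instruction("Instruct: {}\nQuery: {}", "Find relevant passages", "query 1"))
```

Under this reading, the default `"{}{}"` explains why the instruction string in Section 1 ends with a trailing space: nothing else separates it from the query.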

This is not limited to the BGE series; other models such as E5 are supported in the same way:


```python
# Assumed E5 key; check list(AUTO_EMBEDDER_MAPPING.keys()) for the exact names
print(AUTO_EMBEDDER_MAPPING['e5-mistral-7b-instruct'])
```

## 3. Customization

If you want to use your own model through `FlagAutoModel`, work through the following questions:

1. What type is your embedding model, an encoder or a decoder? Choose the model class accordingly.
2. Which pooling method does it use: CLS token, mean pooling, or last token?
3. Does your model need `trust_remote_code=True` to run?
4. Is there a query instruction format for retrieval?

Once these four attributes are settled, add your model name as the key and the corresponding `EmbedderConfig` as the value to `AUTO_EMBEDDER_MAPPING`. Now give it a try!
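The registration step can be sketched without installing anything. The block below uses stand-ins: `DemoPoolingMethod`, `DemoEmbedderConfig`, and `demo_mapping` are hypothetical names that mirror the shape of the library objects shown in Section 2, and `"my-embedder"` is a made-up model name. The enum member names and the string used for `model_class` are assumptions for illustration only.

```python
from collections import OrderedDict
from dataclasses import dataclass
from enum import Enum


class DemoPoolingMethod(Enum):
    # Stand-in for the library's PoolingMethod; member names are illustrative.
    CLS = "cls"
    MEAN = "mean"
    LAST_TOKEN = "last_token"


@dataclass
class DemoEmbedderConfig:
    # Mirrors the four fields of EmbedderConfig shown in Section 2. model_class is
    # a plain string here instead of a real embedder class, so the sketch runs
    # without FlagEmbedding installed.
    model_class: str
    pooling_method: DemoPoolingMethod
    trust_remote_code: bool = False
    query_instruction_format: str = "{}{}"


# Stand-in for AUTO_EMBEDDER_MAPPING.
demo_mapping = OrderedDict()

# Answers to the four questions for a hypothetical encoder model with mean pooling.
demo_mapping["my-embedder"] = DemoEmbedderConfig(
    model_class="encoder-only",
    pooling_method=DemoPoolingMethod.MEAN,
    trust_remote_code=False,
    query_instruction_format="{}{}",
)

config = demo_mapping["my-embedder"]
print(config)
```

With the real library, the same pattern would use the `AUTO_EMBEDDER_MAPPING` imported in Section 2, the library's own `EmbedderConfig`, and an actual embedder class for `model_class`.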
