# Condensed: Evaluation Using Sentence Transformers

Summary: This tutorial demonstrates how to implement information retrieval evaluation using Sentence Transformers, specifically focusing on the BeIR Quora dataset. It covers techniques for loading and preprocessing evaluation datasets, creating efficient corpus subsets, and establishing query-document relevance mappings. The tutorial helps with tasks like setting up IR evaluation pipelines, managing large document collections, and computing retrieval metrics. Key functionalities include working with the SentenceTransformer model, utilizing the InformationRetrievalEvaluator, and handling dataset relationships through dictionary mappings and set operations.

*This is a condensed version that preserves essential implementation details and context.*

Here's the condensed tutorial focusing on essential implementation details:

# Sentence Transformers Evaluation Guide

## Setup
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator
from datasets import load_dataset
import random

# Initialize model
model = SentenceTransformer('all-MiniLM-L6-v2')
```

## Information Retrieval Evaluation

### 1. Load Datasets
```python
# Load BeIR Quora datasets
corpus = load_dataset("BeIR/quora", "corpus", split="corpus")
queries = load_dataset("BeIR/quora", "queries", split="queries")
relevant_docs_data = load_dataset("BeIR/quora-qrels", split="validation")
```

### 2. Prepare Evaluation Data
```python
# Create subset of corpus (relevant docs + 10k random samples)
required_corpus_ids = list(map(str, relevant_docs_data["corpus-id"]))
required_corpus_ids += random.sample(corpus["_id"], k=10_000)
corpus = corpus.filter(lambda x: x["_id"] in required_corpus_ids)

# Convert to dictionaries
corpus = dict(zip(corpus["_id"], corpus["text"]))  # cid => document
queries = dict(zip(queries["_id"], queries["text"]))  # qid => question

# Create relevance mapping
relevant_docs = {}  # qid => set([relevant_cids])
for qid, corpus_ids in zip(relevant_docs_data["query-id"], relevant_docs_data["corpus-id"]):
    qid = str(qid)
    corpus_ids = str(corpus_ids)
    if qid not in relevant_docs:
        relevant_docs[qid] = set()
    relevant_docs[qid].add(corpus_ids)
```

### 3. Run Evaluation
```python
ir_evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    name="BeIR-quora-dev",
)

results = ir_evaluator(model)
```

Key Points:
- Uses Sentence Transformers for information retrieval evaluation
- Evaluates on BeIR Quora dataset
- Creates a manageable subset of corpus for efficient evaluation
- Maintains mapping between queries and relevant documents
- Uses InformationRetrievalEvaluator for computing IR metrics