Summary: This tutorial provides implementation guidance for Faiss, a vector similarity search library, covering four main index types: IndexFlat (exact search), IndexIVF (clustering-based), IndexHNSW (graph-based), and IndexLSH (hash-based). It helps with tasks involving nearest neighbor search in vector spaces, particularly useful for recommendation systems and similarity matching. Key features include installation instructions, index-specific parameter tuning (nlist, nprobe, M, efConstruction), best practices for performance optimization, and basic usage patterns with numpy arrays. The tutorial emphasizes trade-offs between speed, accuracy, and memory usage across different index types, helping developers choose the appropriate index for their specific use case.

# Faiss Indexes

This tutorial will go through several widely used indexes in Faiss that fits different requirements, and how to use them.

## Preparation

For CPU usage, use:


```python
%pip install faiss-cpu
```

For GPU on Linux x86_64 system, use Conda:

```conda install -c pytorch -c nvidia faiss-gpu=1.8.0```


```python
import faiss
import numpy as np

np.random.seed(768)

data = np.random.random((1000, 128))
```

## 1. `IndexFlat*`

Flat index is the very fundamental index structure. It does not do any preprocess for the incoming vectors. All the vectors are stored directly without compression or quantization. Thus no training is need for flat indexes.

When searching, Flat index will decode all the vectors sequentially and compute the similarity score to the query vectors. Thus, Flat Index guarantees the global optimum of results.

Flat index family is small: just `IndexFlatL2` and `IndexFlatIP`, which are just different by the similarity metrics of Euclidean distance and inner product.

Usage:


```python
d = 128  # dimension of the vector
k = 3    # number of nearest neighbors to search

# just simply create the index and add all the data
index = faiss.IndexFlatL2(d)
index.add(data)
```

Sanity check:


```python
# search for the k nearest neighbor for the first element in data
D, I = index.search(data[:1], k)

print(f"closest elements: {I}")
print(f"distance: {D}")
```

Flat Indexes guarantee the perfect quality but with terrible speed. It works well on small datasets or the cases that speed is not a crucial factor. 

But what about the cases that speed is important? There's no way to have it all. So we want some indexes that only sacrifice as small as possible quality to speed up. That's why approximate nearest-neighbors (ANN) algorithms are widely accepted. Now we will go through a few popular ANN methods used in vector searching.

## 2. `IndexIVF*`

### Intro

Inverted File Flat (IVF) Index is a widely accepted technique to speed up searching by using k-means or Voronoi diagram to create a number of cells (or say, clusters) in the whole space. Then when given a query, an amount of closest cells will be searched. After that, `k` closest elements to the query will be searched in those cells.

- `quantizer` is another index/quantizer to assign vectors to inverted lists.
- `nlist` is the number of cells the space to be partitioned.
- `nprob` is the nuber of closest cells to visit for searching in query time.

### Tradeoff

Increasing `nlist` will shrink the size of each cell, which speed up the search process. But the smaller coverage will sacrifice accuracy and increase the possibility of the edge/surface problem discribed above.

Increasing `nprob` will have a greater scope, preferring search quality by the tradeoff of slower speed.

### Shortage

There could be a problem when the query vector lands on the edge/surface of the cell. It is possible that the closest element falls into the neighbor cell, which may not be considered due to `nprob` is not large enough.

### Example


```python
nlist = 5
nprob = 2

# the quantizer defines how to store and compare the vectors
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist)

# note different from flat index, IVF index first needs training to create the cells
index.train(data)
index.add(data)
```


```python
# set nprob before searching
index.nprobe = 8
D, I = index.search(data[:1], k)

print(f"closest elements: {I}")
print(f"distance: {D}")
```

## 3. `IndexHNSW*`

### Intro

Hierarchical Navigable Small World (HNSW) indexing is a graph based method, which is an extension of navigable small world (NSW). It builds a multi-layered graph where nodes (vectors) are connected based on their proximity, forming "small-world" structures that allow efficient navigation through the space.

- `M` is the number of neighbors each vector has in the graph.
- `efConstruction` is the number of entry points to explore when building the index.
- `efSearch` is the number of entry points to explore when searching.

### Tradeoff

Increasing `M` or `efSearch` will make greater fidelity with reasonable longer time. Larger `efConstruction` mainly increases the index construction time.

HNSW has great searching quality and speed. But it is memory-consuming due to the graph structure. Scaling up `M` will cause a linear increase of memory usage.

Note that HNSW index does not support vector's removal because removing nodes will distroy graph structure.

Thus HNSW is a great index to choose when RAM is not a limiting factor.

### Example


```python
M = 32
ef_search = 16
ef_construction = 32

index = faiss.IndexHNSWFlat(d, M)
# set the two parameters before adding data
index.hnsw.efConstruction = ef_construction
index.hnsw.efSearch = ef_search

index.add(data)
```


```python
D, I = index.search(data[:1], k)

print(f"closest elements: {I}")
print(f"distance: {D}")
```

### 4. `IndexLSH`

### Intro

Locality Sensitive Hashing (LSH) is an ANN method that hashing data points into buckets. While well known use cases of hash function such as dictionary/hashtabel are trying to avoid hashing collisions, LSH trys to maximize hashing collisions. Similar vectors will be grouped into same hash bucket.

In Faiss, `IndexLSH` is a Flat index with binary codes. Vectors are hashed into binary codes and compared by Hamming distances.

- `nbits` can be seen as the "resolution" of hashed vectors.

### Tradeoff

Increasing `nbits` can get higher fidelity with the cost of more memory and longer searching time.

LSH suffers the curse of dimensionality when using a larger `d`. In order to get similar search quality, the `nbits` value needs to be scaled up to maintain the search quality.

### Shortage

LSH speeds up searching time with a reasonable sacrifice of quality. But that only applies to small dimension `d`. Even 128 is already too large for LSH. Thus for vectors generated by transformer based embedding models, LSH index is not a common choice.

### Example


```python
nbits = d * 8

index = faiss.IndexLSH(d, nbits)
index.train(data)
index.add(data)
```


```python
D, I = index.search(data[:1], k)

print(f"closest elements: {I}")
print(f"distance: {D}")
```
