# 🚀 FusedANN: Geometric Transformation for Efficient Filtered Vector Search

This repository provides a complete Python implementation of the paper:

**"FusedANN: Geometric Transformation for Efficient Filtered Vector Search"**  

---

## 📖 Overview

**FusedANN** is a geometric framework that integrates attribute filtering with vector similarity search to support efficient, scalable hybrid queries. This implementation includes:

- **Single and Multi-Attribute Filtering**
- **Hierarchical Attribute Priority**
- **Efficient Range Filtering (3-level hierarchical indexing)**
- **Extensive Experimentation and Ablation Studies**

---

## 📂 Repository Structure

```shell
FusedANN/
├── datasets/
│   └── dataset_loader.py
├── models/
│   ├── base_ann.py
│   ├── fused_ann.py
│   └── range_ann.py
├── api/
│   ├── build_index.py
│   └── query_index.py
├── utils/
│   ├── transform.py
│   └── indexing.py
├── experiments/
│   ├── exp_single_attribute.py
│   ├── exp_multi_attribute.py
│   ├── exp_range_filter.py
│   ├── exp_ablation.py
│   └── exp_scalability.py
├── main.py
├── requirements.txt
└── README.md
```

---

## ⚙️ Installation Guide

### Step 1: Clone the Repository

```bash
git clone <your_repo_url>
cd FusedANN
```

### Step 2: Setup Python Environment

We recommend using a virtual environment:

```bash
python -m venv env
source env/bin/activate    # Linux/macOS
# .\env\Scripts\activate   # Windows: run this instead of the line above
```

### Step 3: Install Dependencies

```bash
pip install --upgrade pip
pip install -r requirements.txt
```

---

## 📋 Requirements

- Python 3.7+
- numpy
- scipy
- scikit-learn
- hnswlib
- faiss-cpu

*(All dependencies are listed in `requirements.txt`.)*
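
To confirm that the environment is ready before running anything, a small check along the following lines can help. This is a minimal sketch, not part of the repository: the helper name `missing_packages` is hypothetical, and the mapping assumes the usual import names (`sklearn` for scikit-learn, `faiss` for faiss-cpu).

```python
from importlib.util import find_spec

# pip package name -> module name used at import time.
# scikit-learn imports as `sklearn`; faiss-cpu imports as `faiss`.
REQUIRED_MODULES = {
    "numpy": "numpy",
    "scipy": "scipy",
    "scikit-learn": "sklearn",
    "hnswlib": "hnswlib",
    "faiss-cpu": "faiss",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip package names whose import module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

If anything is reported as missing, re-running `pip install -r requirements.txt` inside the activated environment is the first thing to try.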

---

## 📥 Dataset Preparation

Store your datasets in the `datasets/` folder. Each dataset must be a `.npz` file containing two arrays:

- `vectors`: shape `(N, d)` (pre-embedded vectors)
- `attributes`: shape `(N, m)` (pre-embedded attributes)

Example of creating a dataset file:

```python
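import numpy as np

# Random placeholder data for illustration only (not the real SIFT1M vectors).
rng = np.random.default_rng(0)
vectors = rng.random((10000, 128), dtype=np.float32)    # shape (N, d)
attributes = rng.random((10000, 32), dtype=np.float32)  # shape (N, m)
np.savez("datasets/SIFT1M.npz", vectors=vectors, attributes=attributes)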
```
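
Before building an index, it can be useful to verify that a file matches the format above. The following is a minimal sketch under that assumption; the function name `load_dataset` is hypothetical and is not necessarily what `datasets/dataset_loader.py` provides.

```python
import numpy as np


def load_dataset(path):
    """Load an .npz dataset and check the two arrays described above."""
    with np.load(path) as data:
        missing = {"vectors", "attributes"} - set(data.files)
        if missing:
            raise KeyError(f"{path} is missing arrays: {sorted(missing)}")
        vectors = data["vectors"]
        attributes = data["attributes"]

    # Both arrays must be 2-D and describe the same N points.
    if vectors.ndim != 2 or attributes.ndim != 2:
        raise ValueError("vectors and attributes must both be 2-D arrays")
    if vectors.shape[0] != attributes.shape[0]:
        raise ValueError(
            f"row mismatch: {vectors.shape[0]} vectors vs "
            f"{attributes.shape[0]} attribute rows"
        )
    return vectors, attributes
```

Calling `load_dataset("datasets/SIFT1M.npz")` on the file created above would return arrays of shape `(10000, 128)` and `(10000, 32)`.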

---

## 🚧 Running Experiments

From the repository root, run each experiment individually:

```bash
# Single Attribute Filtering
python experiments/exp_single_attribute.py

# Multi-Attribute Filtering
python experiments/exp_multi_attribute.py

# Range Filtering (3-level indexing described in paper)
python experiments/exp_range_filter.py

# Ablation Studies
python experiments/exp_ablation.py

# Scalability Analysis
python experiments/exp_scalability.py
```

---

## 🧪 Experiment Results

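After running, you should see output in the following form. The values shown are illustrative; actual numbers depend on the dataset, hardware, and parameters: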

```shell
Single Attribute Filtering - Avg Recall@10: 0.95
Single Attribute Filtering - Avg Query Time (ms): 1.23

Multi-Attribute Filtering - Avg Recall@10: 0.92
Multi-Attribute Filtering - Avg Query Time (ms): 1.45

Range Filtering - Avg Recall@10: 0.90
Range Filtering - Avg Query Time (ms): 2.12

Scalability - Dataset size 10000: Index Building Time: 0.12s
Scalability - Dataset size 100000: Index Building Time: 1.45s
```
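
For readers unfamiliar with the Recall@10 metric reported above, the sketch below shows one common way to compute Recall@k against exact brute-force neighbours. It is an illustration only: the helper names `exact_knn` and `recall_at_k` are hypothetical, and the experiment scripts may compute the metric differently (for filtered search, the ground truth would be taken over only the points that satisfy the filter).

```python
import numpy as np


def exact_knn(base, queries, k):
    """Brute-force k nearest neighbours by squared Euclidean distance."""
    # ||q - x||^2 = ||q||^2 - 2 q.x + ||x||^2, computed for all pairs at once.
    d2 = (
        (queries ** 2).sum(axis=1, keepdims=True)
        - 2.0 * queries @ base.T
        + (base ** 2).sum(axis=1)
    )
    return np.argsort(d2, axis=1)[:, :k]


def recall_at_k(retrieved, ground_truth):
    """Average fraction of true neighbours that appear in each retrieved list."""
    retrieved = np.asarray(retrieved)
    ground_truth = np.asarray(ground_truth)
    hits = [len(set(r) & set(g)) for r, g in zip(retrieved, ground_truth)]
    return float(np.mean(hits)) / ground_truth.shape[1]
```

Here `ground_truth` would come from `exact_knn`, and `retrieved` from the approximate index under evaluation, both as integer id arrays of shape `(n_queries, k)`.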

---

## 📂 Code Structure Explained

- `models/`: ANN indexing implementations (`fused_ann.py`, `range_ann.py`) and the base class (`base_ann.py`).
- `api/`: APIs for building and querying indexes.
- `datasets/`: Dataset loading utilities.
- `utils/`: Transformation and indexing utilities.
- `experiments/`: Reproducible experiments from the paper.

---

## 🚀 Three-Level Hierarchical Indexing for Range Filtering

The three-level hierarchical indexing described in the paper is implemented across the following files:

1. **Adaptive Line Sampling**: `experiments/exp_range_filter.py`
2. **Hierarchical Line Similarity Indexing**: `utils/indexing.py`
3. **Cylindrical Distance Indexing**: `models/range_ann.py`

---

## 🛠️ Customizing Experiments

Adjust hyperparameters (e.g., `alpha`, `beta`, dataset sizes) in the experiment scripts to explore other scenarios or to reproduce the paper's results.
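
To build intuition for what `alpha` and `beta` might control, the sketch below shows one possible fusion of vectors and attributes. This is an assumption for illustration, not the confirmed definition used here: the function name `fuse` is hypothetical, and the form (split each vector into blocks of the attribute dimension, subtract `alpha` times the attribute from every block, divide by `beta`) should be checked against `utils/transform.py` and the paper.

```python
import numpy as np


def fuse(vectors, attributes, alpha, beta):
    """Hypothetical fusion: shift each length-m block by alpha * attribute, scale by 1/beta."""
    n, d = vectors.shape
    m = attributes.shape[1]
    if d % m != 0:
        raise ValueError("d must be a multiple of m for this block layout")
    # View each vector as d/m consecutive blocks of length m.
    blocks = vectors.reshape(n, d // m, m)
    # Broadcast each point's attribute vector across all of its blocks.
    fused = (blocks - alpha * attributes[:, None, :]) / beta
    return fused.reshape(n, d)
```

Under this assumed form, two points with identical attributes keep their mutual distance up to the factor `1 / beta`, while a larger `alpha` tends to push points with different attributes further apart. The dataset example above (`d = 128`, `m = 32`) satisfies the divisibility requirement.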

---

## 🚨 Performance Recommendation

For large-scale applications, we strongly recommend replacing the simple ANN implementation with optimized libraries such as:

- [FAISS](https://github.com/facebookresearch/faiss)
- [HNSWlib](https://github.com/nmslib/hnswlib)

Both are optimized C++ libraries with Python bindings, so index construction and querying are typically much faster than with a simple reference implementation.
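
As a starting point, the sketch below builds an HNSW index with `hnswlib` over a set of vectors (for example, already-transformed ones) and queries it. The wrapper name `knn_search` and the parameter values (`M=16`, `ef_construction=200`, `ef=50`) are illustrative assumptions, not settings taken from the paper. The snippet falls back to exact NumPy search when `hnswlib` is not installed, so it runs either way.

```python
import numpy as np

try:
    import hnswlib  # optional accelerator; listed in requirements.txt
except ImportError:
    hnswlib = None


def knn_search(base, queries, k=10, ef=50):
    """Return an (n_queries, k) array of neighbour ids, using HNSW if available."""
    base = np.ascontiguousarray(base, dtype=np.float32)
    queries = np.ascontiguousarray(queries, dtype=np.float32)

    if hnswlib is None:
        # Exact fallback: all pairwise squared Euclidean distances.
        d2 = ((queries[:, None, :] - base[None, :, :]) ** 2).sum(axis=2)
        return np.argsort(d2, axis=1)[:, :k]

    index = hnswlib.Index(space="l2", dim=base.shape[1])
    index.init_index(max_elements=base.shape[0], ef_construction=200, M=16)
    index.add_items(base, np.arange(base.shape[0]))
    index.set_ef(max(ef, k))  # the query-time ef should be at least k
    labels, _distances = index.knn_query(queries, k=k)
    return labels
```

Raising `ef` generally trades query speed for recall, so it is a natural parameter to sweep alongside the ones mentioned under Customizing Experiments.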


---

**🎉 Happy Searching!**