# Recursive Stochastic Neighbor Embedding (RSNE)

This repository contains code for **Recursive Stochastic Neighbor Embedding (RSNE)**, including both the **Bi-RSNE** (batch-incremental) and **i-RSNE** (point-wise incremental) variants, applied to **CIFAR-10** and a **climate (weather)** dataset. It also includes DINOv2-based feature extraction and a baseline Barnes–Hut t-SNE evaluation.

## Included Scripts

- `Bi-RSNE.py`  
  Batch-incremental RSNE on pre-extracted features.

- `i-RSNE.py`  
  Point-wise incremental RSNE for real-time streaming.

- `Bi-RSNE_climate.py`  
  RSNE directly on the raw weather dataset using monthly updates.

- `feature_extraction_cifar10.py`  
  Extracts DINOv2 ViT-L/14 features from CIFAR-10 images.

- `original_bh_tsne.py`  
  Runs standard Barnes–Hut t-SNE on saved feature vectors.

## Data File

- `weatherHistory.csv`  
  Raw climate dataset used in `Bi-RSNE_climate.py`.

---

## 1. Feature Extraction (CIFAR-10)

**Script:** `feature_extraction_cifar10.py`

1. Downloads the CIFAR-10 training set via `torchvision` (requires internet access).
2. Extracts features for all 50,000 training images using DINOv2 ViT-L/14.

**Usage**
```bash
python feature_extraction_cifar10.py --output_prefix cifar10
```

**Output**
- `cifar10_train_features.npy` (shape: 50000 × 1024)
- `cifar10_train_labels.npy` (shape: 50000,)
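
A quick sanity check on the saved arrays can catch a truncated or mismatched extraction run before it reaches the RSNE scripts. The sketch below is a minimal example, not part of this repository: `check_feature_files` is a hypothetical helper, and the default `expected_dim=1024` simply mirrors the feature shape listed above.

```python
import numpy as np


def check_feature_files(features_path, labels_path, expected_dim=1024):
    """Load saved features and labels and verify that their shapes agree.

    Hypothetical helper for illustration; not part of the repository scripts.
    """
    X = np.load(features_path)
    y = np.load(labels_path)

    if X.ndim != 2:
        raise ValueError(f"expected a 2-D feature array, got shape {X.shape}")
    if X.shape[1] != expected_dim:
        raise ValueError(
            f"expected {expected_dim} feature dimensions, got {X.shape[1]}"
        )
    if y.shape != (X.shape[0],):
        raise ValueError(
            f"labels shape {y.shape} does not match {X.shape[0]} feature rows"
        )
    return X, y
```

For the files above, this could be called as `check_feature_files("cifar10_train_features.npy", "cifar10_train_labels.npy")`.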

---

## 2. RSNE Variants

### Bi-RSNE (Batch Incremental)

**Script:** `Bi-RSNE.py`

Runs batch-incremental RSNE on saved features.

**Usage**
```bash
python Bi-RSNE.py \
  --X_path cifar10_train_features.npy \
  --y_path cifar10_train_labels.npy \
  --split_ratio 0.5 \
  --K 200 \
  --batch_size 100
```

---

### i-RSNE (Point-wise Incremental)

**Script:** `i-RSNE.py`

Performs fully incremental RSNE on individual samples.

**Usage**
```bash
python i-RSNE.py \
  --X_path cifar10_train_features.npy \
  --y_path cifar10_train_labels.npy \
  --split_ratio 0.5 \
  --K 1000 \
  --eta 10.0
```

---

## 3. Baseline: Barnes–Hut t-SNE

**Script:** `original_bh_tsne.py`

Runs Barnes–Hut t-SNE on the full dataset.

**Usage**
```bash
python original_bh_tsne.py \
  --X_path cifar10_train_features.npy \
  --y_path cifar10_train_labels.npy
```

---

## 4. Climate Dataset Variant

**Script:** `Bi-RSNE_climate.py`

Runs RSNE on the weather dataset with engineered features.

**Usage**
```bash
python Bi-RSNE_climate.py \
  --csv_path /path/to/weatherHistory.csv \
  --K 200
```

