# Evidence of 'Catch, Tag, and Release' Mechanism through PCA Visualization

This guide walks through the process of extracting features, generating PCA-based token visualizations, and viewing the results in a web-based interface.

---

## Step 1: Extract Features from Phi-3-Medium
Run the following script to extract embeddings and attention sinks:

```bash
python extract_phi3_medium_features.py
```

Notes Before Running:
- Ensure you have the Phi-3-Medium model weights and a GPU (e.g., A40 or similar) capable of loading the model.
- Run this script first; Step 2 depends on the features it extracts.
- Update the path to the Phi-3-Medium model weights on line 11 of the script.
- Update the prompt on lines 24-50 of the script to match your desired input.
- Set `base_output_dir` on line 56 of the script to define where extracted features will be saved.

What This Does:
- Extracts features from the Phi-3-Medium model for the prompt passed into it.
- Saves attention sinks as images.

## Step 2: Generate PCA-Based Token Color Visualizations
Once features have been extracted, run:
```bash
python color_tokens.py
```

Notes Before Running:
- Update the path to the Phi-3-Medium model weights on line 11 of the script.
- Run Step 1 first; this script expects the extracted features to already exist.

What This Does:
- Computes PCA projections of the extracted features.
- Colors each token based on its PCA components.
- Generates `.json` files mapping tokens to their corresponding colors.
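
For readers who want a feel for the PCA-to-color idea, here is a minimal sketch. It is not the code in `color_tokens.py`: the function names `pca_colors` and `tokens_to_json`, the choice of three components mapped to RGB, the min-max scaling, and the JSON layout are all assumptions made for illustration. It also assumes at least three tokens and a hidden dimension of at least three.

```python
import json

import numpy as np


def pca_colors(features: np.ndarray, n_components: int = 3) -> np.ndarray:
    """Map (num_tokens, hidden_dim) features to integer RGB values in [0, 255].

    Hypothetical helper: projects onto the top principal components and
    min-max scales each component independently.
    """
    # Center the features so the SVD yields principal directions.
    centered = features - features.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_components].T
    # Rescale each component to [0, 1]; guard against a zero range.
    lo = projected.min(axis=0, keepdims=True)
    hi = projected.max(axis=0, keepdims=True)
    span = np.where(hi - lo > 0, hi - lo, 1.0)
    scaled = (projected - lo) / span
    return np.round(scaled * 255).astype(int)


def tokens_to_json(tokens, features) -> str:
    """Serialize tokens with hex colors (assumed schema, not the repo's)."""
    colors = pca_colors(np.asarray(features)).tolist()  # plain Python ints
    records = [
        {"token": tok, "color": "#{:02x}{:02x}{:02x}".format(*rgb)}
        for tok, rgb in zip(tokens, colors)
    ]
    return json.dumps(records)
```

Tokens whose features lie close together along the leading components end up with similar colors, which is what makes groups of tokens visible at a glance.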

## Step 3: View the Visualization Locally
1. Copy the generated folder containing the `.json` files and attention sinks into a local folder.
2. Inside this folder, create a subfolder called `prompts/` (any subfolder placed in `prompts/` will be recognized by `index.html`).
3. To view the visualizations, navigate to the directory containing `index.html` and the `prompts/` folder, then run:
   ```bash
   python3 -m http.server 8002
   ```
4. Open `http://localhost:8002` in your browser to explore the visualizations.

## Example Folder Structure
```
/local_folder/
 ├── index.html
 ├── prompts/
 │   ├── experiment_1/
 │   │   ├── extracted_features
 │   │   ├── attention_sinks
 │   ├── experiment_2/
 │   │   ├── ...
 │   │   ├── ...
```
This setup enables easy switching between different prompts and experiments via the web interface.
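
Before starting the server, it can help to confirm that the layout matches the structure above. The sketch below is a hypothetical helper (`list_experiments` is not part of the repository). It only checks that `index.html` and `prompts/` exist and lists the experiment subfolders; it does not validate their contents.

```python
from pathlib import Path


def list_experiments(root) -> list:
    """Return the sorted names of the experiment folders under root/prompts/.

    Raises FileNotFoundError if index.html or prompts/ is missing.
    """
    root = Path(root)
    if not (root / "index.html").is_file():
        raise FileNotFoundError(f"index.html not found in {root}")
    prompts = root / "prompts"
    if not prompts.is_dir():
        raise FileNotFoundError(f"prompts/ folder not found in {root}")
    # Each immediate subfolder of prompts/ is treated as one experiment.
    return sorted(p.name for p in prompts.iterdir() if p.is_dir())


if __name__ == "__main__":
    try:
        # Run from the directory that contains index.html.
        print(list_experiments("."))
    except FileNotFoundError as err:
        print(err)
```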
