# Noise Stability in Transformer Models - Experimental Results

A research framework for analyzing the sensitivity and robustness of transformer models across tasks and architectures. It covers noise stability, grokking, total influence, and attention patterns.

**Note: Claude-4-sonnet (agent mode) was used to write this README file. The authors then edited it to ensure it was accurate and complete.**

## 📋 Table of Contents

- [Overview](#overview)
- [Project Structure](#project-structure)
- [Installation](#installation)
- [Experiments](#experiments)
  - [Noise Stability Analysis](#noise-stability-analysis)
  - [Boolean Functions Experiments](#boolean-functions-experiments)
  - [Grokking Experiments](#grokking-experiments)
  - [Sentiment Analysis Experiments](#sentiment-analysis-experiments)
  - [Total Influence Experiments](#total-influence-experiments)
  - [Performance Benchmarking](#performance-benchmarking)
- [Model Architecture](#model-architecture)
- [Visualization Tools](#visualization-tools)
- [Results and Figures](#results-and-figures)
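- [Quick Start Examples](#quick-start-examples)
- [Utilities](#utilities)
- [Configuration](#configuration)
- [License](#license)
- [Acknowledgments](#acknowledgments)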

## 🔍 Overview

This project provides tools and experiments for analyzing transformer model behavior, with a particular focus on:

- **Noise Stability**: Measuring how sensitive models are to input perturbations
- **Grokking Phenomena**: Investigating sudden generalization improvements during training
- **Total Influence**: Analyzing attention head contributions across different model architectures
- **Boolean Function Learning**: Understanding how transformers learn structured logical functions

## 📁 Project Structure

```
Transformers_Sensitivity/
├── experiments_with_boolean_functions/    # Boolean function learning experiments
├── grokking_experiments/                  # Grokking phenomenon studies
├── model/                                 # Custom transformer implementations
├── noise_stability/                       # Noise stability measurement tools
├── pos_tagging_experiment/               # Part-of-speech tagging experiments
├── sentiment_experiment/                 # Sentiment analysis experiments
├── total_influence_experiments/          # Attention influence analysis
├── perf/                                 # Performance benchmarking tools
├── utilities/                            # Data generation and visualization utilities
├── visualization/                        # Plotting and visualization tools
├── figures/                              # Generated experiment figures
├── selected_results/                     # Saved experimental results
└── random_noise_stability_experiment.ipynb  # Jupyter notebook experiments
```

## 🛠 Installation

### Prerequisites

- Python 3.8+
- CUDA-compatible GPU (recommended)
- Git

### Basic Installation

1. **Clone the repository:**
```bash
git clone <repository-url>
cd Transformers_Sensitivity
```

2. **Create a virtual environment:**
```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

3. **Install core dependencies:**
```bash
pip install -r requirements.txt
```

### Additional Dependencies for Specific Experiments

**For GPU acceleration:**
```bash
# Ensure you have the correct CUDA version for torch==2.5.1+cu121
pip install torch==2.5.1+cu121 --index-url https://download.pytorch.org/whl/cu121
```

## 🧪 Experiments

### Noise Stability Analysis

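Measures how sensitive a transformer model is to input perturbations using the noise stability metric `E[f(X) * f(Y)] / Var[f(X)]`, where `f` is the model output, `X` is a random input sequence, and `Y` is a noisy copy of `X` correlated with it at level `r`.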

**Key Files:**
- `noise_stability/measure_noise_stability.py`

**Usage:**
```python
from noise_stability.measure_noise_stability import measure_noise_stability

# Measure noise stability for a model
stability = measure_noise_stability(
    model=your_model,
    n=50,  # sequence length
    r=0.8,  # correlation coefficient
    vocab_size=2,
    num_trials=100,
    device='cuda'
)
```
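
To make the metric concrete, the sketch below estimates `E[f(X) * f(Y)] / Var[f(X)]` by Monte Carlo for a plain Python function. It is an illustration only, not the code in `measure_noise_stability.py`. It assumes inputs in {-1, +1}^n, where each coordinate of `Y` equals the matching coordinate of `X` with probability `(1 + r) / 2`, so that `E[X_i * Y_i] = r`. The names `sample_correlated_pair` and `estimate_noise_stability` are hypothetical.

```python
import random


def sample_correlated_pair(n, r, rng):
    """Draw X uniformly from {-1, +1}^n and an r-correlated copy Y (assumed noise model)."""
    x = [rng.choice((-1, 1)) for _ in range(n)]
    # Keep each coordinate with probability (1 + r) / 2, flip it otherwise,
    # which gives E[X_i * Y_i] = r for every coordinate.
    y = [xi if rng.random() < (1 + r) / 2 else -xi for xi in x]
    return x, y


def estimate_noise_stability(f, n, r, num_trials=10_000, seed=0):
    """Monte Carlo estimate of E[f(X) * f(Y)] / Var[f(X)]."""
    rng = random.Random(seed)
    fx_values, products = [], []
    for _ in range(num_trials):
        x, y = sample_correlated_pair(n, r, rng)
        fx = f(x)
        fx_values.append(fx)
        products.append(fx * f(y))
    mean_fx = sum(fx_values) / num_trials
    var_fx = sum((v - mean_fx) ** 2 for v in fx_values) / num_trials
    if var_fx == 0:
        raise ValueError("f is constant on the sample, so Var[f(X)] is zero")
    return (sum(products) / num_trials) / var_fx


def dictator(x):
    """Depends on the first coordinate only; the exact ratio is r."""
    return x[0]


def parity(x):
    """Product of all coordinates; the exact ratio is r ** n."""
    result = 1
    for xi in x:
        result *= xi
    return result


if __name__ == "__main__":
    print(estimate_noise_stability(dictator, n=10, r=0.8))  # should be near 0.8
    print(estimate_noise_stability(parity, n=3, r=0.8))     # should be near 0.8 ** 3
```

The two toy functions show the range of behavior: a function of a single coordinate stays stable under noise, while parity loses stability geometrically in `n`.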

### Boolean Functions Experiments

Investigates how transformers learn structured Boolean functions such as parity and majority.

**Key Files:**
- `experiments_with_boolean_functions/boolean_experiment.py`
- `experiments_with_boolean_functions/boolean_functions.py`

**Usage:**
```bash
# Train on parity function
python experiments_with_boolean_functions/boolean_experiment.py \
    --function parity \
    --n 50 \
    --epochs 100 \
    --lr 0.001 \
    --d 64 \
    --layers 4 \
    --heads 8

# Train on other boolean functions
python experiments_with_boolean_functions/boolean_experiment.py \
    --function majority \
    --n 25 \
    --epochs 50
```

**Available Functions:**
- `parity`: XOR of all input bits
- `majority`: Returns 1 if majority of bits are 1
- `and`: Logical AND of all bits
- `or`: Logical OR of all bits
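
For reference, the four functions above can be written in a few lines of Python. This is a sketch based on the descriptions in the list, not the contents of `boolean_functions.py`. It assumes inputs are sequences of 0/1 integers and that `majority` means a strict majority (ties return 0). The `BOOLEAN_FUNCTIONS` registry is a hypothetical name.

```python
from typing import Callable, Dict, Sequence

Bits = Sequence[int]


def parity(bits: Bits) -> int:
    """XOR of all input bits: 1 if the number of ones is odd."""
    return sum(bits) % 2


def majority(bits: Bits) -> int:
    """1 if strictly more than half of the bits are 1 (tie rule is an assumption)."""
    return int(2 * sum(bits) > len(bits))


def logical_and(bits: Bits) -> int:
    """1 only if every bit is 1."""
    return int(all(bits))


def logical_or(bits: Bits) -> int:
    """1 if at least one bit is 1."""
    return int(any(bits))


# Hypothetical lookup table keyed by the --function values listed above.
BOOLEAN_FUNCTIONS: Dict[str, Callable[[Bits], int]] = {
    "parity": parity,
    "majority": majority,
    "and": logical_and,
    "or": logical_or,
}

if __name__ == "__main__":
    for name, fn in BOOLEAN_FUNCTIONS.items():
        print(name, fn([1, 0, 1]))
```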

### Grokking Experiments

Studies grokking, where a model's generalization improves suddenly after extended training, using modular addition as the task.

**Key Files:**
- `grokking_experiments/grokking_experiment.py`
- `grokking_experiments/addition_function.py`

**Usage:**
```bash
# Run grokking experiment on modular addition
python grokking_experiments/grokking_experiment.py \
    --epochs 1000 \
    --lr 0.001 \
    --d 128 \
    --layers 2 \
    --heads 4 \
    --p 97  # modular arithmetic base

# Multi-GPU training
python grokking_experiments/grokking_experiment.py \
    --use_ddp \
    --gpu_ids 0,1,2,3
```
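
The modular addition task can be illustrated with a small data generator. The sketch below is an assumption about how such a dataset could be built, not the code in `addition_function.py`: it enumerates every pair `(a, b)` with `0 <= a, b < p`, labels it with `(a + b) % p`, and holds out part of the pairs for testing. The function name and the `train_fraction` parameter are hypothetical.

```python
import random


def make_modular_addition_data(p=97, train_fraction=0.3, seed=0):
    """Return (train, test) lists of ((a, b), (a + b) % p) examples."""
    examples = [((a, b), (a + b) % p) for a in range(p) for b in range(p)]
    rng = random.Random(seed)
    rng.shuffle(examples)  # random split so train and test cover the same range
    split = int(train_fraction * len(examples))
    return examples[:split], examples[split:]


if __name__ == "__main__":
    train, test = make_modular_addition_data(p=97)
    print(len(train), len(test))
    print(train[0])
```

Holding out a fixed share of all `p * p` pairs makes it possible to watch training accuracy saturate long before test accuracy does, which is the pattern grokking studies look for.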

### Sentiment Analysis Experiments

Analyzes transformer behavior on sentiment classification using the Stanford Sentiment Treebank (SST-2).

**Key Files:**
- `sentiment_experiment/sentiment_experiment.py`

**Usage:**
```bash
# Run sentiment analysis experiment
python sentiment_experiment/sentiment_experiment.py \
    --epochs 50 \
    --lr 0.0001 \
    --batch_size 32 \
    --d 128 \
    --layers 6 \
    --heads 8 \
    --max_length 128
```

### Total Influence Experiments

Analyzes the influence of individual attention heads across different pre-trained model architectures.

**Key Files:**
- `total_influence_experiments/influence.py`
- `total_influence_experiments/gpt2.py`
- `total_influence_experiments/bert.py`
- `total_influence_experiments/roberta.py`
- `total_influence_experiments/gemma.py`

**Usage:**
```bash
# Analyze GPT-2 attention influence
python total_influence_experiments/influence.py \
    --model gpt2 \
    --n 50 \
    --n_samples 1000 \
    --norm l2 \
    --sampling uniform

# Analyze BERT influence
python total_influence_experiments/influence.py \
    --model bert \
    --n 100 \
    --n_samples 500 \
    --norm l1

# Analyze RoBERTa with verbose output
python total_influence_experiments/influence.py \
    --model roberta \
    --verbose
```

**Supported Models:**
- `gpt2`: GPT-2 language model
- `bert`: BERT masked language model
- `roberta`: RoBERTa model
- `gemma`: Gemma model
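
As background, the sketch below estimates total influence in its textbook form for a Boolean function: the sum over input positions of the probability that flipping that position changes the output. This is the classical definition, shown on toy functions; `influence.py` may define and normalize influence differently (for example with the `--norm` option), so treat the function name and signature here as hypothetical.

```python
import random


def total_influence(f, n, n_samples=1000, seed=0):
    """Monte Carlo estimate of sum_i Pr[f(x) != f(x with bit i flipped)]."""
    rng = random.Random(seed)
    flip_counts = [0] * n
    for _ in range(n_samples):
        x = [rng.randint(0, 1) for _ in range(n)]  # uniform sampling of inputs
        fx = f(x)
        for i in range(n):
            x[i] ^= 1              # flip bit i
            if f(x) != fx:
                flip_counts[i] += 1
            x[i] ^= 1              # restore bit i
    per_position = [count / n_samples for count in flip_counts]
    return sum(per_position), per_position


if __name__ == "__main__":
    parity = lambda bits: sum(bits) % 2
    majority = lambda bits: int(2 * sum(bits) > len(bits))
    print(total_influence(parity, n=5)[0])    # every flip changes parity
    print(total_influence(majority, n=5)[0])  # only near-tied inputs are sensitive
```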

### Performance Benchmarking

Benchmarks single-GPU, DataParallel (`dp`), and DistributedDataParallel (`ddp`) training across model sizes and batch sizes.

**Key Files:**
- `perf/performance_benchmark.py`

**Usage:**
```bash
# Benchmark single vs multi-GPU performance
python perf/performance_benchmark.py \
    --strategies single,dp,ddp \
    --model_sizes small,medium,large \
    --batch_sizes 32,64,128 \
    --gpu_ids 0,1,2,3
```

## 🏗 Model Architecture

The project includes a custom transformer implementation (`model/transformer.py`) with:

- Multi-head attention mechanisms
- Positional encoding (fixed and randomized)
- Layer normalization and residual connections
- Optional MLP layers
- Support for multi-GPU training (DataParallel and DistributedDataParallel)
- Configurable architecture parameters

**Key Features:**
- Binary and multi-class classification support
- Attention weight extraction and visualization
- Noise stability integration
- Performance monitoring and logging
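
For readers new to the attention mechanism listed above, here is a dependency-free sketch of single-head scaled dot-product attention that also returns the attention weights. It illustrates the standard formula `softmax(Q K^T / sqrt(d_k)) V` and is not taken from `model/transformer.py`, which may differ in masking, batching, and head handling. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention on lists of vectors; returns (outputs, weights)."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys
        ]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(len(values[0]))]
        )
    return outputs, weights


if __name__ == "__main__":
    q = [[1.0, 0.0]]
    k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[10.0], [20.0]]
    out, attn = scaled_dot_product_attention(q, k, v)
    print(attn[0], out[0])
```

Returning the weights alongside the outputs is what makes heatmap-style attention visualizations possible.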

## 📊 Visualization Tools

The project includes comprehensive visualization utilities:

**Key Files:**
- `visualization/plotting.py`: General plotting utilities
- `visualization/attention_matrix_visualization.py`: Attention pattern visualization
- `utilities/create_plots.py`: Automated plot generation
- `utilities/visualize_influence.py`: Influence visualization

**Generated Visualizations:**
- Training loss and accuracy curves
- Noise stability measurements over training
- Attention pattern heatmaps
- Influence distribution plots
- Performance comparison charts

## 📈 Results and Figures

All experimental results are automatically saved to:
- `figures/`: Generated plots and visualizations
- `selected_results/`: Detailed experimental logs and model checkpoints

**Naming Convention:**
- Figures: `{metric}_{experiment}_{timestamp}.png`
- Results: `{timestamp}_{model}_{configuration}/`
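
The naming convention can be expressed as two small helpers. This is a sketch: the timestamp format `%Y%m%d_%H%M%S` is an assumption, since the README does not specify one, and `figure_path` and `results_dir` are hypothetical names rather than functions from this repository.

```python
from datetime import datetime
from pathlib import Path

# Assumed timestamp layout; the project may use a different one.
TIMESTAMP_FORMAT = "%Y%m%d_%H%M%S"


def figure_path(metric, experiment, when, root="figures"):
    """Build figures/{metric}_{experiment}_{timestamp}.png."""
    stamp = when.strftime(TIMESTAMP_FORMAT)
    return Path(root) / f"{metric}_{experiment}_{stamp}.png"


def results_dir(model, configuration, when, root="selected_results"):
    """Build selected_results/{timestamp}_{model}_{configuration}/."""
    stamp = when.strftime(TIMESTAMP_FORMAT)
    return Path(root) / f"{stamp}_{model}_{configuration}"


if __name__ == "__main__":
    now = datetime.now()
    print(figure_path("noise_stability", "parity", now))
    print(results_dir("transformer", "d64_l4_h8", now))
```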

## 🚀 Quick Start Examples

**1. Run a basic boolean function experiment:**
```bash
python experiments_with_boolean_functions/boolean_experiment.py --function parity --epochs 50
```

**2. Analyze noise stability:**
```python
from noise_stability.measure_noise_stability import measure_noise_stability
# Use with your trained model
stability = measure_noise_stability(model, n=50, r=0.8, vocab_size=2)
```

**3. Benchmark performance:**
```bash
python perf/performance_benchmark.py --strategies single,dp --model_sizes small,medium
```

**4. Explore in Jupyter:**
```bash
jupyter notebook random_noise_stability_experiment.ipynb
```

## 🛠 Utilities

The project provides several utility modules:

- **Data Generation** (`utilities/data_generation.py`): Generate synthetic datasets
- **Logging** (`utilities/logger.py`): Comprehensive experiment logging
- **Plotting** (`utilities/create_plots.py`): Automated visualization generation
- **Attention Analysis** (`utilities/read_attention_matrices.py`): Extract and analyze attention weights

## 🔧 Configuration

Most experiments support extensive command-line configuration:

**Common Parameters:**
- `--epochs`: Number of training epochs
- `--lr`: Learning rate
- `--d`: Model embedding dimension
- `--layers`: Number of transformer layers
- `--heads`: Number of attention heads
- `--batch_size`: Training batch size
- `--device`: Device selection (cpu/cuda)

**Multi-GPU Options:**
- `--use_dp`: Enable DataParallel
- `--use_ddp`: Enable DistributedDataParallel
- `--gpu_ids`: Specify GPU IDs (e.g., "0,1,2,3")
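
A minimal `argparse` sketch shows how the flags above fit together. The flag names come from the lists above, but the default values, the comma-separated parsing of `--gpu_ids`, and the choice to make `--use_dp` and `--use_ddp` mutually exclusive are assumptions for illustration, not the behavior of the experiment scripts.

```python
import argparse


def parse_gpu_ids(text):
    """Turn a string such as "0,1,2,3" into [0, 1, 2, 3]."""
    return [int(part) for part in text.split(",") if part.strip()]


def build_parser():
    parser = argparse.ArgumentParser(description="Illustrative experiment options")
    # Defaults below are placeholders, not the project's actual defaults.
    parser.add_argument("--epochs", type=int, default=100, help="training epochs")
    parser.add_argument("--lr", type=float, default=1e-3, help="learning rate")
    parser.add_argument("--d", type=int, default=64, help="embedding dimension")
    parser.add_argument("--layers", type=int, default=4, help="transformer layers")
    parser.add_argument("--heads", type=int, default=8, help="attention heads")
    parser.add_argument("--batch_size", type=int, default=32, help="batch size")
    parser.add_argument("--device", choices=["cpu", "cuda"], default="cpu")
    # Assumption: only one multi-GPU strategy may be selected at a time.
    parallel = parser.add_mutually_exclusive_group()
    parallel.add_argument("--use_dp", action="store_true", help="DataParallel")
    parallel.add_argument("--use_ddp", action="store_true", help="DistributedDataParallel")
    parser.add_argument("--gpu_ids", type=parse_gpu_ids, default=[0], help='e.g. "0,1,2,3"')
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))
```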

## 📄 License

This project is available for academic and research purposes. Please cite appropriately if you use this code in your research.

## 🤝 Acknowledgments

This research framework builds upon various open-source libraries and research contributions in the transformer and deep learning community.
