# Hyperfitting Analysis: Beyond Temperature

**Paper Title**: "Beyond Temperature: Understanding How Hyperfitting Changes Token Rankings in LLMs"

This codebase implements experiments to analyze the hyperfitting phenomenon from the ICLR 2025 paper "The Hyperfitting Phenomenon: Sharpening and Stabilizing LLMs for Open-Ended Text Generation".

## Key Hypothesis

**Hyperfitting is NOT equivalent to temperature scaling**, even though both produce sharper probability distributions. Temperature scaling preserves the order of the logits, so it can only change **how probability is distributed among tokens**. Our experiments aim to show that hyperfitting also changes **which tokens are top-ranked**.
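To make the distinction concrete, the sketch below checks that dividing logits by any positive temperature leaves the token ranking unchanged. The `softmax` and `ranking` helpers and the toy logits are illustrative assumptions, not code from this repository.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def ranking(probs):
    """Token indices sorted from most to least probable."""
    return sorted(range(len(probs)), key=lambda i: -probs[i])

# Toy logits over a 5-token vocabulary (illustrative values only).
logits = [2.0, 0.5, 1.2, -0.3, 0.9]

base_rank = ranking(softmax(logits, 1.0))
for t in (0.25, 0.5, 2.0):
    # Dividing by T > 0 is a monotone transform, so the order never changes.
    assert ranking(softmax(logits, t)) == base_rank
```

Any ranking difference between the original and hyperfitted models is therefore something temperature scaling alone cannot produce.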

## Quick Start

### 1. Setup Environment

```bash
# Create conda environment (recommended)
conda create -n hyperfitting python=3.11 -y
conda activate hyperfitting

# Install dependencies
pip install -r requirements.txt
```

### 2. Run Full Pipeline

```bash
# Make scripts executable
chmod +x scripts/*.sh

# Run everything (hyperfitting + all experiments)
# Takes about 2.5 hours on an RTX 4090 for TinyLlama
./scripts/run_full_pipeline.sh
```

### 3. Or Run Step by Step

```bash
# Step 1: Hyperfit a model
python src/run_experiments.py \
    --mode hyperfit \
    --model TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T \
    --num_epochs 20 \
    --save_dir ./checkpoints/hyperfitted_tinyllama

# Step 2: Run experiments
python src/run_experiments.py \
    --mode experiments \
    --original_model TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T \
    --hyperfitted_model ./checkpoints/hyperfitted_tinyllama/final \
    --output_dir ./results
```

## Experiments

### Experiment 1: Temperature Matching

**Goal**: Show that hyperfitting ≠ temperature scaling

**Method**:
1. Measure the entropy of the hyperfitted model's predictions
2. Find a temperature T at which the original model's predictions have the same entropy
3. Compare the generation quality of the hyperfitted model and the temperature-scaled original model

**Expected Result**: Even with matched entropy, the hyperfitted model produces better generations.
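The entropy-matching step can be sketched with a bisection search, since the entropy of `softmax(logits / T)` never decreases as T grows. This is a minimal sketch under stated assumptions: the function names, the search bounds, and the toy logits are hypothetical, and the actual procedure in `src/experiments.py` may differ.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def mean_entropy(batch, temperature=1.0):
    """Average Shannon entropy (in nats) over a batch of logit vectors."""
    total = 0.0
    for logits in batch:
        probs = softmax(logits, temperature)
        total -= sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(batch)

def match_temperature(batch, target_entropy, lo=0.01, hi=10.0, iters=60):
    """Bisection for T in [lo, hi] such that mean_entropy(batch, T) ~= target."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_entropy(batch, mid) < target_entropy:
            lo = mid  # distribution too sharp: raise the temperature
        else:
            hi = mid  # distribution too flat: lower the temperature
    return 0.5 * (lo + hi)

# Toy next-token logits at two positions (illustrative values only).
original = [[2.0, 0.5, 1.2, -0.3, 0.9], [0.1, 1.5, 1.4, 0.0, -1.0]]
hyperfitted = [[6.0, 0.5, 1.2, -0.3, 0.9], [0.1, 5.0, 1.4, 0.0, -1.0]]

target = mean_entropy(hyperfitted)
matched_t = match_temperature(original, target)
```

The search assumes the target entropy is reachable within `[lo, hi]`; in practice the entropies would be averaged over many more positions than shown here.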

### Experiment 2: Rank Analysis

**Goal**: Show that hyperfitting changes which tokens are top-ranked

**Method**:
1. Compare top-k token rankings between original and hyperfitted models
2. Compute rank correlation
3. Identify tokens that get significantly promoted/demoted

**Expected Result**: Significant rank changes (not just probability reweighting).
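The comparisons above can be sketched for a single position as follows. The helper names are hypothetical, the Spearman formula used here assumes no tied logits, and the toy values are illustrative rather than measured.

```python
def ranks(logits):
    """rank[i] = position of token i when sorted by descending logit (0 = top)."""
    order = sorted(range(len(logits)), key=lambda i: -logits[i])
    rank = [0] * len(logits)
    for position, token in enumerate(order):
        rank[token] = position
    return rank

def spearman(logits_a, logits_b):
    """Spearman rho via the no-ties formula 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    ra, rb = ranks(logits_a), ranks(logits_b)
    n = len(ra)
    d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))

def top_k_overlap(logits_a, logits_b, k):
    """Fraction of the top-k tokens shared by both rankings."""
    top_a = {i for i, r in enumerate(ranks(logits_a)) if r < k}
    top_b = {i for i, r in enumerate(ranks(logits_b)) if r < k}
    return len(top_a & top_b) / k

def rank_shifts(orig_logits, hyper_logits):
    """Per-token rank change; positive means promoted by hyperfitting."""
    return [o - h for o, h in zip(ranks(orig_logits), ranks(hyper_logits))]

# Toy logits for one position (illustrative values only).
original = [2.0, 0.5, 1.2, -0.3, 0.9]
hyperfitted = [1.0, 0.4, 3.0, -0.5, 0.2]

rho = spearman(original, hyperfitted)
overlap = top_k_overlap(original, hyperfitted, k=2)
shifts = rank_shifts(original, hyperfitted)
```

In this toy case the top-2 sets are identical but the top-1 token differs, which is exactly the kind of change that probability reweighting alone would not cause.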

### Experiment 3: Synthetic Hyperfitting

**Goal**: Can we achieve hyperfitting effects without training?

**Method**:
1. Learn which tokens are systematically promoted by hyperfitting
2. Create a simple logit adjustment function
3. Apply to original model and test generation quality

**Expected Result**:
- If it works → hyperfitting is about rank modification
- If it doesn't → something deeper changes (representations)
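One possible form of the logit adjustment is an additive per-token bias, estimated as the mean logit difference between the two models. This is an assumption made for illustration: the README does not specify the adjustment function, and `learn_token_bias`, `apply_bias`, and `strength` are hypothetical names.

```python
def learn_token_bias(orig_batch, hyper_batch):
    """Mean per-token logit difference (hyperfitted - original) across positions."""
    vocab = len(orig_batch[0])
    n = len(orig_batch)
    bias = [0.0] * vocab
    for orig, hyper in zip(orig_batch, hyper_batch):
        for tok in range(vocab):
            bias[tok] += (hyper[tok] - orig[tok]) / n
    return bias

def apply_bias(logits, bias, strength=1.0):
    """Add the learned bias to the original model's logits (no training involved)."""
    return [z + strength * b for z, b in zip(logits, bias)]

def argmax(values):
    return max(range(len(values)), key=values.__getitem__)

# Toy paired logits over a 3-token vocabulary (illustrative values only).
orig_batch = [[2.0, 0.5, 1.25], [0.25, 1.0, 0.75]]
hyper_batch = [[2.0, 0.5, 3.25], [0.25, 1.0, 2.75]]

bias = learn_token_bias(orig_batch, hyper_batch)

# Apply the bias to logits from a position not used for fitting.
new_logits = [1.5, 0.2, 1.0]
adjusted = apply_bias(new_logits, bias)
```

A context-independent bias like this can only capture promotions that hold across positions, which is one reason the synthetic correction might fall short of real hyperfitting.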

### Experiment 4: Representation Analysis

**Goal**: Find where in the model hyperfitting causes changes

**Method**:
1. Compare hidden states layer by layer
2. Compute cosine similarity and effective dimensionality
3. Identify which layers change most

**Expected Result**: Identify critical layers for hyperfitting effect.
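The layer-wise comparison can be sketched as below. Effective dimensionality is computed here as the participation ratio, (tr C)^2 / tr(C^2) for the covariance C of the hidden states; that definition is an assumption, since the README does not say which one the code uses. All function names and toy hidden states are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def layer_similarity(orig_layers, hyper_layers):
    """Mean per-token cosine similarity for each layer."""
    sims = []
    for orig_states, hyper_states in zip(orig_layers, hyper_layers):
        per_token = [cosine(u, v) for u, v in zip(orig_states, hyper_states)]
        sims.append(sum(per_token) / len(per_token))
    return sims

def participation_ratio(states):
    """(tr C)^2 / tr(C^2); requires non-zero variance in `states`."""
    n, d = len(states), len(states[0])
    mean = [sum(row[j] for row in states) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in states]
    cov = [[sum(r[i] * r[j] for r in centered) / n for j in range(d)]
           for i in range(d)]
    trace = sum(cov[i][i] for i in range(d))
    # C is symmetric, so tr(C^2) equals the sum of its squared entries.
    trace_sq = sum(cov[i][j] ** 2 for i in range(d) for j in range(d))
    return trace * trace / trace_sq

# Toy hidden states: 2 layers x 2 tokens x 2 dimensions (illustrative only).
orig_layers = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
hyper_layers = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [0.0, 1.0]]]

sims = layer_similarity(orig_layers, hyper_layers)
most_changed = min(range(len(sims)), key=sims.__getitem__)
```

The layer with the lowest mean cosine similarity is the one that changed most under this measure; real hidden states would be far higher-dimensional and are usually handled with a tensor library.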

## Project Structure

```
hyperfitting_analysis/
├── src/
│   ├── config.py               # Configuration classes
│   ├── hyperfitting_trainer.py # Hyperfitting training code
│   ├── metrics.py              # Evaluation metrics
│   ├── experiments.py          # All 4 experiments
│   ├── run_experiments.py      # Main entry point
│   └── visualize_results.py    # Visualize results
├── scripts/
│   ├── run_full_pipeline.sh    # Run everything
│   ├── run_experiments_only.sh # Run experiments only
│   └── train_multi_model.sh    # Train more models for ablation studies
├── lora_experiments/           # LoRA experiments output directory
├── sf_experiments/             # SF experiments output directory
├── train_history/              # Training history directory; loss JSON files are saved here
├── README.md                   # README file
├── RESULTS/                    # Results markdown file with a brief summary of the results
└── requirements.txt
```

## Hardware Requirements for SFT Fine-Tuning

| Model | VRAM Required | Estimated Time (hyperfitting) |
|-------|--------------|-------------------------------|
| TinyLlama 1.1B | 24 GB | 2.5 hours |
| Qwen 2.5 1.5B | 96 GB | 10 hours |
| Llama 3.2 3B | 96 GB | 10 hours |
| Gemma 2 2B | 96 GB | 10 hours |

## Hardware Requirements for LoRA Fine-Tuning

| Model | VRAM Required | Estimated Time (hyperfitting) |
|-------|--------------|-------------------------------|
| TinyLlama 1.1B | 24 GB | 2.5 hours |
| Qwen 2.5 1.5B (batch size 8) | 96 GB | 10 hours |
| Llama 3.2 3B (batch size 8) | 96 GB | 10 hours |
| Gemma 2 2B (batch size 8) | 96 GB | 10 hours |
| Llama 3.1 8B (batch size 1) | 24 GB (Unsloth model) | 10 hours |
| DeepSeek 7B (batch size 1) | 24 GB (Unsloth model) | 10 hours |

## Key Metrics

- **TTR (Type-Token Ratio)**: Measures lexical diversity. Higher = less repetition.
- **Entropy**: Measures distribution sharpness. Lower = more confident predictions.
- **Rank Correlation**: Measures how similar token rankings are between models.
- **Top-1 Agreement**: How often both models predict the same top token.
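
As a rough illustration of two of these metrics, the sketch below computes TTR over a token sequence and top-1 agreement over paired logits. The function names and the whitespace tokenization are assumptions; `src/metrics.py` may define them differently, for example over model token IDs.

```python
def type_token_ratio(tokens):
    """Unique tokens divided by total tokens; 0.0 for an empty sequence."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def top1_agreement(orig_batch, hyper_batch):
    """Fraction of positions where both models pick the same top token."""
    def argmax(values):
        return max(range(len(values)), key=values.__getitem__)
    matches = sum(argmax(o) == argmax(h)
                  for o, h in zip(orig_batch, hyper_batch))
    return matches / len(orig_batch)

# Whitespace tokenization is used here only to keep the example small.
ttr = type_token_ratio("the cat sat on the mat".split())

# Toy logits at four positions over a 2-token vocabulary (illustrative only).
orig_batch = [[2.0, 1.0], [0.0, 3.0], [5.0, 1.0], [1.0, 2.0]]
hyper_batch = [[3.0, 0.0], [4.0, 1.0], [6.0, 0.0], [0.0, 9.0]]
agreement = top1_agreement(orig_batch, hyper_batch)
```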

## Expected Results

Based on our hypothesis, we expect to find:

1. **Experiment 1**: TTR of the hyperfitted model >> TTR of the original model at the entropy-matched temperature
2. **Experiment 2**: Top-1 agreement rate < 80% (significant rank changes)
3. **Experiment 3**: Synthetic corrections partially work but don't fully replicate hyperfitting
4. **Experiment 4**: Later layers show more change than earlier layers

## Command Reference

```bash
# Hyperfit a model
python src/run_experiments.py --mode hyperfit \
    --model MODEL_NAME \
    --num_samples 2000 \
    --num_epochs 20 \
    --learning_rate 1e-6

# Run all experiments
python src/run_experiments.py --mode experiments \
    --original_model ORIGINAL \
    --hyperfitted_model HYPERFITTED

# Run individual experiment
python src/run_experiments.py --mode experiment1 ...
python src/run_experiments.py --mode experiment2 ...
python src/run_experiments.py --mode experiment3 ...
python src/run_experiments.py --mode experiment4 ...
```

## Citation
**TODO: The citation for this work is still a placeholder.**

If this code helps your research, please cite the original hyperfitting paper:

```bibtex
@inproceedings{carlsson2025hyperfitting,
  title={The Hyperfitting Phenomenon: Sharpening and Stabilizing {LLM}s for Open-Ended Text Generation},
  author={Carlsson, Fredrik and Liu, Fangyu and Ward, Daniel and Kurfali, Murathan and Nivre, Joakim},
  booktitle={ICLR},
  year={2025}
}
```

## License

MIT License
