# Reproducibility

This codebase contains all scripts used for the experiments and plots in the paper "Mechanism and emergence of stacked attention heads in multi-layer transformers".

Most scripts require some parameter adjustments to obtain the desired results.

The LLM experiments require API keys for the different LLMs. Obtaining access to and configuring the external APIs is not documented here.
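The benchmark scripts listed below expect the API keys in a `.env` file. As a rough sketch of how such a file could be checked before a run, the snippet below parses `KEY=VALUE` lines and reports missing entries. The variable names in `REQUIRED_KEYS` and the helpers `parse_env` and `missing_keys` are assumptions made for illustration; check each script for the names it actually reads.

```python
from pathlib import Path

# Hypothetical variable names (an assumption, not taken from the scripts).
REQUIRED_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GOOGLE_API_KEY")


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def missing_keys(env, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    path = Path(".env")
    env = parse_env(path.read_text()) if path.exists() else {}
    print("missing keys:", missing_keys(env))
```

Running this from the repository root before the benchmarks gives an early warning instead of a failed API call partway through a run.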

The codebase contains the following scripts:

- `llms/03_gpt_retrieval_equations.py` is used to benchmark the LLMs on the equations formulation of the retrieval problem. Install third-party API connectors using `pip install google-generativeai openai anthropic`. Make sure to store the API keys in the `.env` file.
- `llms/04_gpt_retrieval.py` is used to benchmark GPT-4 on the other four formulations (lives-with, kingdoms, functions, relatives). Uncomment the desired
  formulation on lines 312-315.
- `llms/05_plots.py` is used to generate the plots using the data from the previous two scripts.
- `min-gpt/data.py` contains the synthetic dataset for the minimal formulation of the retrieval problem.
- `min-gpt/model.py` contains the transformer model trained on the minimal formulation. This code is based on the minGPT repository by Andrej Karpathy. The `forward` method contains additional parameters and functionality used for plotting and ablation.
- `min-gpt/train.py` is used to train a single transformer on the minimal formulation. Adjust the training parameters as desired.
- `min-gpt/mega_train.py` is used to train many transformers with various numbers of layers.
- `min-gpt/plot_mega_train.py` is used to plot the final validation accuracy of the transformers trained with `mega_train.py`.
- `min-gpt/plot_losses.py` is used to plot the loss curves of a single transformer trained with `train.py`.
- `min-gpt/attn_viz.py` is used to visualize the attention maps of a single transformer.
- `min-gpt/attn_ablat.py` is used to compute the validation loss of a transformer after ablating the attention maps as described in the paper.
- `min-gpt/attn_emergence.py` is used to compute the average attention between certain input positions of a transformer during training.
- `min-gpt/plot_attn_curves.py` is used to visualize the average attention computed by the previous script. It also computes the first epoch at which the average attention exceeds a given threshold.
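
The threshold computation mentioned for `plot_attn_curves.py` can be pictured with the minimal sketch below. It is not the code from the script: the function name `first_epoch_above`, the 0-based epoch indexing, and the strict `>` comparison are assumptions made for illustration.

```python
def first_epoch_above(avg_attention, threshold):
    """Return the index of the first epoch whose average attention
    is strictly above `threshold`, or None if it never is.

    `avg_attention` is assumed to hold one value per epoch, in training order.
    """
    for epoch, value in enumerate(avg_attention):
        if value > threshold:
            return epoch
    return None


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    curve = [0.10, 0.20, 0.60, 0.40, 0.90]
    print(first_epoch_above(curve, threshold=0.5))
```

The sketch returns the first crossing even if the curve later dips back below the threshold, which is one possible reading of "first epoch above the threshold".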
