We provide six Python scripts used in our paper:

- `2_train_interpret.py` for training and visualizing a standard attention-only transformer (Section 2)
- `3_weights_visualization.py` for visualizing the model's weight matrices during training (Section 3)
- `4_h1.py` for testing hypothesis 1 (Section 4)
- `4_h2.py` for testing hypothesis 2 (Section 4)
- `5_emergence_time_by_N.py` for plotting the emergence times vs. N (Section 5)
- `B_math_checker.py` for verifying our derivations in Appendix B
