This folder contains the core code for the experiments in the paper. Each subfolder is described below.

- `attention-ranks`: Code for experiments on weight decay’s effect on the ranks of attention matrices in the model.
- `eval`: Evaluations of the solutions generated by the fine-tuned models. Includes the six metrics used to evaluate solution correctness and quality.
- `finetune`: Code used to perform supervised fine-tuning on pretrained models. We use the codebase accompanying the EvoLM paper (Qi et al., 2025), which in turn uses the llamafactory codebase. This folder contains the key scripts for fine-tuning.
- `linear-probing`: Code for linear probing experiments examining weight decay’s effect on model representations.
- `pretrain`: Code used to pretrain models. We use the codebase accompanying the EvoLM paper (Qi et al., 2025), which in turn uses the litgpt codebase. This folder contains the key scripts for pretraining.