- An updated manuscript with the changes highlighted in blue (mamut-with-updates.pdf)
- math-mutator contains the code used to generate the mathematical datasets.
- sympy-random-LaTex is the sympy fork required for dataset generation.
- transformer_pretraining is the framework used for mathematical pretraining.
- transformer-math-evaluation is the framework used to evaluate the mathematical models through fine-tuning.
- The NMFT dataset is located at math-mutator/data/data.json.
- Because the generated datasets are large (several GB each), they are not included here. Similar datasets can be generated by following the instructions in math-mutator/ReadMe.md.
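
As a quick sanity check after unpacking, the NMFT file listed above can be inspected with a few lines of Python. This is only a minimal sketch: the helper name `summarize_json` is hypothetical and not part of any of the listed packages, and the schema of data.json is not described here, so the code reports just the top-level structure rather than assuming any field names.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and describe its top-level structure.

    Hypothetical helper for illustration; it makes no assumption
    about the schema of the file it reads.
    """
    with Path(path).open(encoding="utf-8") as f:
        data = json.load(f)

    summary = {"type": type(data).__name__}
    if isinstance(data, (list, dict)):
        # Number of top-level entries (list items or dict keys).
        summary["entries"] = len(data)
    if isinstance(data, dict):
        # Show at most the first five keys to keep the output short.
        summary["first_keys"] = list(data)[:5]
    return summary


if __name__ == "__main__":
    # Path taken from the list above; assumes the script is run
    # from the root of the supplementary material.
    path = Path("math-mutator/data/data.json")
    if path.exists():
        print(summarize_json(path))
    else:
        print(f"{path} not found; run this from the root directory")
```

The larger generated datasets are not covered by this check, since they have to be produced first as described in math-mutator/ReadMe.md.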