Code for the LLM experiment in the paper.
Code for the other experiments is similar, though simpler.
#############################

Occam gradient descent for a Shakespeare LLM,
based on Karpathy's nanoGPT.
Data files are not included; please see the link below.
https://github.com/karpathy/nanoGPT/blob/master/README.md

