There are three directories, one for each task:
(1) Long-range image classification
(2) Copy and selective copy
(3) Shakespeare LLM

The core transformer code is based on karpathy's nanoGPT.
The Shakespeare data is not included here, but is available via the nanoGPT README:
https://github.com/karpathy/nanoGPT/blob/master/README.md
