# Incremental Memorization Experiment

We train models from the Pythia family to memorize random birth years associated with random names.
Each model is trained to memorize 128 sentences of the form `<name> was born in <year>`, where each name consists of 8 random lowercase characters and each birth year is sampled at random between 1800 and 1999.
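
As a rough illustration of the data described above, the following sketch builds such a set of sentences. It is not the repository's actual generation code: the function name `make_dataset`, the fixed seed, the uniqueness check on names, and the choice to treat both 1800 and 1999 as inclusive bounds are all assumptions made for this example.

```python
import random
import string


def make_dataset(n_sentences=128, name_len=8, year_min=1800, year_max=1999, seed=0):
    """Return a list of '<name> was born in <year>' sentences.

    Hypothetical helper: the seed, the unique-name constraint and the
    inclusive year bounds are assumptions, not taken from the experiment.
    """
    rng = random.Random(seed)
    names = []
    seen = set()
    while len(names) < n_sentences:
        # A name is `name_len` random lowercase ASCII letters.
        name = "".join(rng.choice(string.ascii_lowercase) for _ in range(name_len))
        if name not in seen:  # assumed: no two sentences share a name
            seen.add(name)
            names.append(name)
    # randint includes both endpoints (assumed inclusive range).
    return [f"{name} was born in {rng.randint(year_min, year_max)}" for name in names]


if __name__ == "__main__":
    sentences = make_dataset()
    print(len(sentences))
    print(sentences[0])
```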

We investigate how the models memorize this information by plotting three quantities over the 50 training epochs: the probability of the correct answer, the entropy of the distribution over birth-year answers, and the perplexity of the whole sentence.
Results are shown for the sentences that the model was trained to memorize (train) and for a holdout set of sentences unseen during training, which have the same syntax but different names and birth years (test).
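
To make the three quantities concrete, here is a minimal sketch of how they could be computed from model outputs. It assumes the answer distribution is a softmax over one score per candidate year and that per-token log-probabilities of the sentence are available. The experiment may compute these differently (for example, a year may span several tokens), and the function names are hypothetical. Entropy is reported in nats.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def answer_metrics(year_logits, years, correct_year):
    """Probability of the correct year and entropy of the answer distribution.

    Assumption: `year_logits[i]` is the model's score for `years[i]`, and the
    distribution is renormalized over these candidate years only.
    """
    probs = softmax(year_logits)
    p_correct = probs[years.index(correct_year)]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)  # in nats
    return p_correct, entropy


def perplexity(token_logprobs):
    """Perplexity: exp of the mean negative log-probability per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


if __name__ == "__main__":
    years = list(range(1800, 2000))
    # An untrained model with no preference gives a uniform distribution.
    p, h = answer_metrics([0.0] * len(years), years, correct_year=1900)
    print(p, h)
    print(perplexity([math.log(0.5)] * 4))
```

Under these assumptions, a model with no knowledge of the answer assigns probability 1/200 to the correct year and has entropy ln(200), while a model that has fully memorized a sentence approaches probability 1 and entropy 0.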
