Abstract: In the machine learning research community, it is generally believed that there is a tension between memorization and generalization. In this work we examine to what extent this tension exists, by exploring if it is possible to generalize through memorization alone. Although direct memorization with a lookup table obviously does not generalize, we find that introducing depth in the form of a network of support-limited lookup tables leads to generalization that is significantly above chance and closer to those obtained by standard learning algorithms on several tasks derived from MNIST and CIFAR-10. Furthermore, we demonstrate through a series of empirical results that our approach allows for a smooth tradeoff between memorization and generalization and exhibits some of the most salient characteristics of neural networks: depth improves performance; random data can be memorized and yet there is generalization on real data; and memorizing random data is harder in a certain sense than memorizing real data. The extreme simplicity of the algorithm and potential connections with stability provide important insights into the impact of depth on learning algorithms, and point to several interesting directions for future research.
TL;DR: It is possible to generalize a fair bit by memorizing alone; and this thought experiment leads to an interesting toy model.
Keywords: deep learning, network architecture, memorization, generalization error