Abstract: In the machine learning research community, it is generally believed that
there is a tension between memorization and generalization. In this work we
examine to what extent this tension exists by exploring whether it is
possible to generalize through memorization alone. Although direct memorization
with a lookup table obviously does not generalize, we find that introducing
depth in the form of a network of support-limited lookup tables leads to
generalization that is significantly above chance and closer to that
obtained by standard learning algorithms on several tasks derived from MNIST
and CIFAR-10. Furthermore, we demonstrate through a series of
empirical results that our approach allows for a smooth tradeoff between
memorization and generalization and exhibits some of the most salient
characteristics of neural networks: depth improves performance; random data
can be memorized and yet there is generalization on real data; and
memorizing random data is harder in a certain sense than memorizing real
data. The extreme simplicity of the algorithm and potential connections
with stability provide important insights into the impact of depth on
learning algorithms, and point to several interesting directions for future
research.
TL;DR: It is possible to generalize a fair bit by memorization alone, and this thought experiment leads to an interesting toy model.
Keywords: deep learning, network architecture, memorization, generalization error
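To make the architecture described in the abstract concrete, here is a minimal sketch, assuming binary (0/1) inputs and binary labels, of one way a network of support-limited lookup tables could be built and trained purely by memorization. The names (`LutLayer`, `memorize`, `fit_predict`), the random choice of supports, and the layer widths are illustrative assumptions, not the paper's exact construction.

```python
# Sketch: layers of k-input lookup tables (LUTs), each memorizing the
# majority training label for every pattern on its small input support.
# Assumes 0/1 integer inputs and binary labels; not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

class LutLayer:
    def __init__(self, n_in, n_luts, k):
        # Each LUT reads k randomly chosen input bits (its limited "support").
        self.supports = rng.integers(0, n_in, size=(n_luts, k))
        self.k = k
        self.tables = None  # filled in by memorize()

    def _patterns(self, X):
        # For each example and each LUT, interpret the k support bits
        # as a base-2 index into that LUT's table.
        bits = X[:, self.supports]              # shape (n, n_luts, k)
        weights = 2 ** np.arange(self.k)
        return bits @ weights                   # shape (n, n_luts)

    def memorize(self, X, y):
        # Pure memorization: for every LUT and every one of the 2^k
        # patterns, store the majority label among training examples
        # that exhibit that pattern on the LUT's support.
        idx = self._patterns(X)
        n_luts = self.supports.shape[0]
        votes = np.zeros((n_luts, 2 ** self.k, 2))
        for lut in range(n_luts):
            np.add.at(votes[lut], (idx[:, lut], y), 1)
        self.tables = votes.argmax(axis=-1).astype(np.uint8)

    def forward(self, X):
        # Look up the memorized label for each example's pattern.
        idx = self._patterns(X)
        return self.tables[np.arange(self.tables.shape[0]), idx]

def fit_predict(X_train, y_train, X_test, widths=(256, 64, 1), k=8):
    # Stack layers for depth: each layer memorizes the training labels as a
    # function of its own (binary) inputs, i.e., the previous layer's outputs.
    # X_train, X_test are 0/1 integer arrays (e.g., thresholded pixels).
    A_tr, A_te = X_train, X_test
    for w in widths:
        layer = LutLayer(A_tr.shape[1], w, k)
        layer.memorize(A_tr, y_train)
        A_tr, A_te = layer.forward(A_tr), layer.forward(A_te)
    return A_te[:, 0]   # final single-LUT layer gives the predicted label
```

A single layer of this kind is just a collection of partial lookup tables; the point of the sketch is that stacking such layers, with each layer memorizing labels of the previous layer's outputs, is the sense in which depth is introduced while every component remains a memorizer.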