Measures of Information Reflect Memorization Patterns

Rachit Bansal; Danish Pruthi; Yonatan Belinkov

Published: 31 Oct 2022, Last Modified: 14 Jan 2023

Venue: NeurIPS 2022 (Accept)

Readers: Everyone

Keywords: OOD generalization, memorization, spurious correlations, challenge sets, evaluation, model selection, information

TL;DR: Notions of information organization across neural activations allow us to characterize memorization behaviour in neural networks

Abstract: Neural networks are known to exploit spurious artifacts (or shortcuts) that co-occur with a target label, exhibiting heuristic memorization. Networks have also been shown to memorize individual training examples, resulting in example-level memorization. Both kinds of memorization impede generalization beyond the training distribution. Detecting such memorization can be challenging, often requiring researchers to curate tailored test sets. In this work, we hypothesize, and subsequently show, that the diversity in the activation patterns of different neurons is reflective of model generalization and memorization. We quantify the diversity in the neural activations through information-theoretic measures and find support for our hypothesis in experiments spanning several natural language and vision tasks. Importantly, we discover that information organization points to the two forms of memorization, even for neural activations computed on unlabeled in-distribution examples. Lastly, we demonstrate the utility of our findings for the problem of model selection.
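
To make "quantifying the diversity in neural activations through information-theoretic measures" concrete, here is a minimal sketch. It is not the authors' implementation. It assumes, purely for illustration, that activations are discretized into equal-width bins, that per-neuron diversity is measured with Shannon entropy, and that redundancy between neurons is measured with pairwise mutual information. All function names (`discretize`, `entropy`, `mutual_information`, `activation_diversity`) and the `n_bins` parameter are hypothetical.

```python
import math
from collections import Counter


def discretize(values, n_bins=10):
    """Map real-valued activations to equal-width bin indices (an assumed estimator)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant neuron falls into a single bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # Clamp so the maximum value lands in the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def entropy(symbols):
    """Shannon entropy, in bits, of a sequence of discrete symbols."""
    n = len(symbols)
    counts = Counter(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), in bits, from paired discrete samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def activation_diversity(activations, n_bins=10):
    """Summarize a layer's activations.

    activations: one list per neuron, each holding that neuron's
    activation on every (unlabeled) input example.
    Returns (mean per-neuron entropy, mean pairwise mutual information).
    """
    binned = [discretize(a, n_bins) for a in activations]
    mean_entropy = sum(entropy(b) for b in binned) / len(binned)
    pairs = [(i, j) for i in range(len(binned)) for j in range(i + 1, len(binned))]
    if not pairs:
        return mean_entropy, 0.0
    mean_mi = sum(mutual_information(binned[i], binned[j]) for i, j in pairs) / len(pairs)
    return mean_entropy, mean_mi
```

Under these assumptions, the two summary numbers could be computed on in-distribution inputs without labels and compared across candidate models. How the values relate to memorization is the paper's empirical question and is not encoded in this sketch.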

Supplementary Material: pdf
