Estimating informativeness of samples with Smooth Unique Information

Hrayr Harutyunyan; Alessandro Achille; Giovanni Paolini; Orchid Majumder; Avinash Ravichandran; Rahul Bhotika; Stefano Soatto

Published: 12 Jan 2021, Last Modified: 22 Jun 2025 | ICLR 2021 Poster | Readers: Everyone

Keywords: sample information, information theory, stability theory, NTK, dataset summarization

Abstract: We define a notion of information that an individual sample provides to the training of a neural network, and we specialize it to measure both how much a sample informs the final weights and how much it informs the function computed by the weights. We show that these quantities, though related, behave qualitatively differently. We give efficient approximations of these quantities using a linearized network and demonstrate empirically that the approximation is accurate for real-world architectures, such as pre-trained ResNets. We apply these measures to several problems, such as dataset summarization, analysis of under-sampled classes, comparison of informativeness of different data sources, and detection of adversarial and corrupted examples. Our work generalizes existing frameworks but enjoys better computational properties for heavily over-parametrized models, which makes it possible to apply it to real-world networks.

One-sentence Summary: We define, both in weight-space and function-space, a notion of unique information that an individual sample provides to the training of a deep network and show how to compute it efficiently for large networks using a linearization of the model.

Code: [awslabs/aws-cv-unique-information](https://github.com/awslabs/aws-cv-unique-information)

Community Implementations: [1 code implementation (CatalyzeX)](https://www.catalyzex.com/paper/estimating-informativeness-of-samples-with/code)
