InfoMax-based Resampling for Dataset Balance and Diversity

ICLR 2026 Conference Submission 25154 Authors

20 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: InfoMax, mutual information, entropy maximization, weighted InfoNCE, change of measure, density-ratio estimation, dataset reweighting, balanced sampling
TL;DR: Learn sample weights via a mutual-information proxy for entropy to push data toward uniform coverage, using a consistent, low-variance weighted InfoNCE that yields plug-in weights for filtration and balanced sampling.
Abstract: We propose a principled reweighting framework that moves empirical data toward uniform coverage through implicit differential entropy maximization. The core idea is to replace the intractable entropy-maximization objective with a mutual-information proxy; variational estimators derived under a change of measure then yield a consistent, low-variance weighted InfoNCE. The learned weights are immediately usable for data filtration and imbalance-aware sampling.
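To make the abstract's "weighted InfoNCE" concrete, here is a minimal sketch of a per-sample-weighted InfoNCE mutual-information estimate. The exact critic, weight parameterization, and normalization used in the submission are not public, so this is an illustrative assumption: `scores` is a critic matrix with positive pairs on the diagonal, and `weights` stands in for the paper's change-of-measure weights, normalized to mean one.

```python
import numpy as np

def weighted_infonce(scores, weights):
    """Per-sample-weighted InfoNCE lower bound on mutual information.

    scores:  (n, n) critic values f(x_i, y_j); diagonal entries are
             positive pairs, off-diagonal entries are negatives.
    weights: (n,) nonnegative per-sample weights (a hypothetical stand-in
             for the paper's change-of-measure / density-ratio weights).
    """
    n = scores.shape[0]
    w = weights / weights.mean()  # normalize so the weights average to 1

    # Numerically stable row-wise log-softmax; the positive sits on the diagonal.
    shifted = scores - scores.max(axis=1, keepdims=True)
    log_prob = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Standard InfoNCE per-sample term log(n * softmax_ii), reweighted by w.
    per_sample = np.diag(log_prob) + np.log(n)
    return float((w * per_sample).mean())
```

Once such weights are learned, the "balanced sampling" use mentioned in the abstract amounts to drawing indices with probability proportional to the weights, e.g. via `np.random.default_rng().choice(n, size=m, p=w / w.sum())`.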
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 25154