Improved Mutual Information Estimation

Youssef Mroueh*; Igor Melnyk*; Pierre Dognin*; Jerret Ross*; Tom Sercu*

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Blind Submission | Readers: Everyone

Keywords: mutual information, variational bound, kernel methods, neural estimators, mutual information maximization, self-supervised learning

TL;DR: We propose a new variational bound for estimating mutual information and demonstrate the strength of the resulting estimator in large-scale self-supervised representation learning through MI maximization.

Abstract: We propose a new variational lower bound on the KL divergence and show that the Mutual Information (MI) can be estimated by maximizing this bound over a witness function from a hypothesis function class and an auxiliary scalar variable. If the function class is a Reproducing Kernel Hilbert Space (RKHS), this leads to a jointly convex problem. We analyze the bound by deriving its dual formulation and show its connection to a likelihood ratio estimation problem. We show that the auxiliary variable introduced in our variational form plays the role of a Lagrange multiplier that enforces a normalization constraint on the likelihood ratio. By extending the function space to neural networks, we propose an efficient neural MI estimator and validate its performance on synthetic examples, showing an advantage over existing baselines. We then demonstrate the strength of our estimator in large-scale self-supervised representation learning through MI maximization.
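
The abstract does not state the bound itself, so the following is only a minimal sketch of one way a witness function f and an auxiliary scalar eta can enter a KL lower bound; it is an assumption, not necessarily the authors' exact formulation. It starts from the Donsker-Varadhan bound KL(P||Q) >= E_P[f] - log E_Q[exp(f)] and linearizes the logarithm with the tangent inequality log(m) <= exp(-eta) * m + eta - 1, which gives a bound that is jointly concave in (f, eta). The Gaussian toy problem, the fixed witness, and the function names `dv_bound` and `eta_bound` are all hypothetical choices made for illustration.

```python
import numpy as np


def dv_bound(f_p, f_q):
    """Donsker-Varadhan estimate: mean of f under P minus log-mean-exp of f under Q."""
    return f_p.mean() - np.log(np.exp(f_q).mean())


def eta_bound(f_p, f_q, eta):
    """Linearized variant (illustrative assumption, not taken from the source).

    Uses log(m) <= exp(-eta) * m + eta - 1, so this value never exceeds
    dv_bound(f_p, f_q) and equals it when eta = log(mean(exp(f_q))).
    """
    return f_p.mean() - np.exp(-eta) * np.exp(f_q).mean() - eta + 1.0


# Toy problem (assumed): P = N(1, 1), Q = N(0, 1), so KL(P || Q) = 0.5.
rng = np.random.default_rng(0)
n = 200_000
x_p = rng.normal(1.0, 1.0, n)
x_q = rng.normal(0.0, 1.0, n)


def witness(x):
    # Exact log-likelihood ratio log p(x)/q(x) for this Gaussian pair.
    return x - 0.5


f_p, f_q = witness(x_p), witness(x_q)

# Maximizing eta_bound over eta gives eta* = log mean(exp(f_q)), i.e. the
# normalization mean(exp(f_q - eta*)) = 1 on the estimated likelihood ratio.
eta_star = np.log(np.exp(f_q).mean())

dv = dv_bound(f_p, f_q)
eb_opt = eta_bound(f_p, f_q, eta_star)
eb_off = eta_bound(f_p, f_q, eta_star + 1.0)

print(f"DV estimate:              {dv:.4f}")
print(f"eta bound at eta*:        {eb_opt:.4f}")
print(f"eta bound at eta* + 1:    {eb_off:.4f}")
```

In this sketch the optimal eta makes the sample mean of exp(f - eta) under Q equal to one, which is consistent with the abstract's description of the auxiliary variable as a multiplier enforcing normalization of the likelihood ratio. In practice the witness would be learned (for example in an RKHS or with a neural network) rather than fixed as it is here.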

