Abstract: We argue that the estimation of mutual information between high-dimensional continuous random variables can be achieved by gradient descent over neural networks. We present a Mutual Information Neural Estimator (MINE) that is linearly scalable in dimensionality as well as in sample size, trainable through back-propagation, and strongly consistent. We present a handful of applications in which MINE can be used to minimize or maximize mutual information. We apply MINE to improve adversarially trained generative models. We also use MINE to implement the Information Bottleneck, applying it to supervised classification; our results demonstrate substantial improvement in flexibility and performance in these settings.
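To make the core idea concrete: MINE trains a neural network to maximize the Donsker-Varadhan lower bound on the KL divergence between the joint distribution and the product of marginals, which equals the mutual information. Below is a minimal, illustrative sketch of that training loop in PyTorch; it is not the authors' implementation, and the network architecture, hyperparameters, and the names `StatisticsNetwork` and `mine_lower_bound` are assumptions made for this example. Marginal samples are approximated by shuffling one variable within the batch, a common practice for this estimator.

```python
# Sketch of a MINE-style estimator via the Donsker-Varadhan bound:
#   I(X; Z) >= sup_T  E_{p(x,z)}[T(x,z)] - log E_{p(x)p(z)}[exp(T(x,z))]
import math
import torch
import torch.nn as nn

class StatisticsNetwork(nn.Module):
    """Small MLP T_theta(x, z); the architecture is an illustrative choice."""
    def __init__(self, x_dim, z_dim, hidden=100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=1))

def mine_lower_bound(T, x, z):
    """Donsker-Varadhan estimate from a batch of joint samples (x, z).
    Marginal samples come from shuffling z within the batch."""
    z_marg = z[torch.randperm(z.size(0))]
    t_joint = T(x, z).mean()
    # log of the batch average of exp(T) under the product of marginals
    t_marg = torch.logsumexp(T(x, z_marg).squeeze(1), dim=0) - math.log(z.size(0))
    return t_joint - t_marg

# Usage: componentwise-correlated Gaussians, where the true MI is known
# in closed form: -dim/2 * log(1 - rho^2) ~= 4.15 nats for rho=0.9, dim=5.
torch.manual_seed(0)
dim, n, rho = 5, 512, 0.9
T = StatisticsNetwork(dim, dim)
opt = torch.optim.Adam(T.parameters(), lr=1e-3)
for step in range(2000):
    x = torch.randn(n, dim)
    z = rho * x + math.sqrt(1 - rho ** 2) * torch.randn(n, dim)
    loss = -mine_lower_bound(T, x, z)  # gradient ascent on the bound
    opt.zero_grad(); loss.backward(); opt.step()
print("MI estimate (nats):", mine_lower_bound(T, x, z).item())
```

Because the bound holds for any network T, maximizing it by back-propagation yields an increasingly tight estimate of the mutual information, which is what makes the estimator usable as a differentiable training signal in the applications listed above.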