The Convex Information Bottleneck Lagrangian

Sep 25, 2019 ICLR 2020 Conference Withdrawn Submission readers: everyone Show Bibtex
  • Keywords: information bottleneck, representation learning
  • TL;DR: We introduce a general family of Lagrangians that allow exploring the IB curve in all scenarios. When these are used, and the IB curve is known, one can optimize directly for a performance/compression level directly.
  • Abstract: The information bottleneck (IB) problem tackles the issue of obtaining relevant compressed representations T of some random variable X for the task of predicting Y. It is defined as a constrained optimization problem which maximizes the information the representation has about the task, I(T;Y), while ensuring that a minimum level of compression r is achieved (i.e., I(X;T) <= r). For practical reasons the problem is usually solved by maximizing the IB Lagrangian for many values of the Lagrange multiplier, therefore drawing the IB curve (i.e., the curve of maximal I(T;Y) for a given I(X;Y)) and selecting the representation of desired predictability and compression. It is known when Y is a deterministic function of X, the IB curve cannot be explored and other Lagrangians have been proposed to tackle this problem (e.g., the squared IB Lagrangian). In this paper we (i) present a general family of Lagrangians which allow for the exploration of the IB curve in all scenarios; (ii) prove that if these Lagrangians are used, there is a one-to-one mapping between the Lagrange multiplier and the desired compression rate r for known IB curve shapes, hence, freeing from the burden of solving the optimization problem for many values of the Lagrange multiplier.
  • Code: https://gofile.io/?c=G9Dl1L
0 Replies

Loading