End-to-End Learnable Masks With Differentiable Indexing

01 Mar 2023 (modified: 31 May 2023) · Submitted to Tiny Papers @ ICLR 2023
Keywords: Top-K, Gaussian, Gumbel, Backpropagation
TL;DR: End-to-end optimization of masks via differentiable top-k relaxation
Abstract: An essential step towards developing efficient learning algorithms is being able to achieve good performance with as little data as possible. For this reason, sparse representation learning is a crucial avenue of computer vision research. However, sparsity-inducing methods such as importance sampling rely on non-differentiable operators like masking or top-k selection. While several tricks have been proposed for getting gradients to flow ‘through’ the pixels selected by these operators, the indices of the pixels that are masked or selected remain non-differentiable and thus cannot be learned end-to-end. We propose three methods that make operations like masking and top-k selection fully differentiable by allowing gradients to flow through the operator indices, and we show how these indices can be optimized end-to-end using backpropagation. As a result, all three methods can be used as simple layers or submodules in existing neural network libraries.
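The abstract does not include code, but the kind of layer it describes can be sketched. Below is a minimal PyTorch illustration of one plausible differentiable top-k relaxation, not the paper's actual method: learned importance scores are perturbed with Gumbel noise, a temperature-scaled sigmoid around the k-th largest perturbed score yields a soft mask, and a straight-through estimator returns a hard 0/1 mask in the forward pass while letting gradients flow through the soft relaxation. The class name, the sigmoid-around-threshold relaxation, and the hyperparameters are all illustrative assumptions.

```python
import torch
import torch.nn as nn


class DifferentiableTopKMask(nn.Module):
    """Illustrative relaxed top-k mask (an assumption, not the paper's method):
    Gumbel-perturbed scores, a sigmoid relaxation around the k-th largest
    score, and a straight-through hard mask in the forward pass."""

    def __init__(self, k: int, temperature: float = 1.0):
        super().__init__()
        self.k = k
        self.temperature = temperature

    def forward(self, scores: torch.Tensor) -> torch.Tensor:
        # scores: (batch, n) learned per-element importance logits.
        # Perturb with Gumbel noise so selection is stochastic during training.
        u = torch.rand_like(scores)
        gumbel = -torch.log(-torch.log(u + 1e-9) + 1e-9)
        perturbed = scores + gumbel

        # Soft mask: distance of each score to the k-th largest score,
        # squashed through a sigmoid (differentiable everywhere).
        threshold = perturbed.topk(self.k, dim=-1).values[..., -1:]
        soft_mask = torch.sigmoid((perturbed - threshold) / self.temperature)

        # Straight-through estimator: hard 0/1 mask in the forward pass,
        # gradient of the soft mask in the backward pass.
        hard_mask = (perturbed >= threshold).float()
        return hard_mask + soft_mask - soft_mask.detach()


# Usage sketch: mask 4 of 16 features, with gradients reaching the scores.
layer = DifferentiableTopKMask(k=4, temperature=0.5)
scores = nn.Parameter(torch.randn(2, 16))
mask = layer(scores)
loss = (mask * torch.randn(2, 16)).sum()
loss.backward()  # scores.grad is populated through the relaxation
```

Annealing `temperature` toward zero over training tightens the soft mask toward the hard one; a Gaussian variant, as suggested by the keywords, would presumably replace the Gumbel perturbation with Gaussian noise.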