Learning Latent Permutations with Gumbel-Sinkhorn Networks

Gonzalo Mena; David Belanger; Scott Linderman; Jasper Snoek

15 Feb 2018 (modified: 22 Jun 2025) | ICLR 2018 Conference Blind Submission | Readers: Everyone

Abstract: Permutations and matchings are core building blocks in a variety of latent variable models, as they allow us to align, canonicalize, and sort data. Learning in such models is difficult, however, because exact marginalization over these combinatorial objects is intractable. In response, this paper introduces a collection of new methods for end-to-end learning in such models that approximate discrete maximum-weight matching using the continuous Sinkhorn operator. Sinkhorn iteration is attractive because it functions as a simple, easy-to-implement analog of the softmax operator. With this, we can define the Gumbel-Sinkhorn method, an extension of the Gumbel-Softmax method (Jang et al. 2016, Maddison et al. 2016) to distributions over latent matchings. We demonstrate the effectiveness of our method by outperforming competitive baselines on a range of qualitatively different tasks: sorting numbers, solving jigsaw puzzles, and identifying neural signals in worms.
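To make the softmax analogy concrete, here is a minimal NumPy sketch of the two operations the abstract names. It is not the authors' implementation (see the linked repository for that). The function names `sinkhorn` and `gumbel_sinkhorn`, the default iteration count, and the temperature parameter `tau` are assumptions chosen for illustration. The Sinkhorn operator alternately normalizes the rows and columns of an exponentiated score matrix. The Gumbel-Sinkhorn variant first perturbs the scores with Gumbel noise and divides by a temperature.

```python
import numpy as np

def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis, keeping dims.
    m = np.max(a, axis=axis, keepdims=True)
    return m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))

def sinkhorn(log_alpha, n_iters=20):
    """Hypothetical helper: alternate row/column normalization in log space.

    Returns a matrix that is approximately doubly stochastic; the number
    of iterations is an illustrative choice, not a value from the paper.
    """
    log_alpha = np.asarray(log_alpha, dtype=float)
    for _ in range(n_iters):
        log_alpha = log_alpha - _logsumexp(log_alpha, axis=1)  # rows sum to 1
        log_alpha = log_alpha - _logsumexp(log_alpha, axis=0)  # columns sum to 1
    return np.exp(log_alpha)

def gumbel_sinkhorn(log_alpha, tau=1.0, n_iters=20, rng=None):
    """Hypothetical helper: Sinkhorn applied to Gumbel-perturbed scores.

    tau is the temperature; smaller values push the output toward a
    hard permutation matrix.
    """
    rng = np.random.default_rng() if rng is None else rng
    log_alpha = np.asarray(log_alpha, dtype=float)
    gumbel = rng.gumbel(size=log_alpha.shape)  # standard Gumbel(0, 1) noise
    return sinkhorn((log_alpha + gumbel) / tau, n_iters=n_iters)

# Usage: a relaxed sample over 3x3 matchings from arbitrary example scores.
scores = np.array([[2.0, 0.0, 0.0],
                   [0.0, 2.0, 0.0],
                   [0.0, 0.0, 2.0]])
sample = gumbel_sinkhorn(scores, tau=0.5, rng=np.random.default_rng(0))
```

Working in log space avoids overflow when the temperature is small. Because each step is differentiable, the same loop can be written in an autodiff framework to pass gradients back to the scores.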

TL;DR: A new method for gradient-descent inference of permutations, with applications to latent matching inference and supervised learning of permutations with neural networks

Keywords: Permutation, Latent, Sinkhorn, Inference, Optimal Transport, Gumbel, Softmax, Sorting

Code: [google/gumbel_sinkhorn](https://github.com/google/gumbel_sinkhorn) + [1 community implementation](https://paperswithcode.com/paper/?openreview=Byt3oJ-0W)

Community Implementations: [3 code implementations](https://www.catalyzex.com/paper/learning-latent-permutations-with-gumbel/code)
