Two Regimes of Generalization for Non-Linear Metric Learning

Published: 28 Jan 2022, Last Modified: 13 Feb 2023 (ICLR 2022 Submission)
Keywords: metric learning, guarantees, sparsity
Abstract: A common approach to metric learning is to seek an embedding of the input data that behaves well with respect to the labels, i.e., under which same-class points are close and different-class points are far apart. While generalization bounds for linear embeddings are known, the non-linear case is not well understood. In this work, we fill this gap by providing uniform generalization guarantees for the case where the metric is induced by a neural-network embedding of the data. Specifically, we discover and analyze two regimes of behavior of the networks, roughly related to the sparsity of the last layer. The bounds corresponding to the first regime are based on the spectral and $(2,1)$-norms of the weight matrices, while the second-regime bounds use the $(2,\infty)$-norm at the last layer and are significantly stronger when that layer is dense. In addition, we empirically evaluate the behavior of the bounds for networks trained with SGD on the MNIST and 20 Newsgroups datasets. In particular, we demonstrate that both regimes occur naturally on realistic data.
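The exact norm conventions are fixed in the paper itself; as a reading aid, one common convention for the mixed norms named above (the paper may use the transposed, row-wise version) is

$$\|W\|_{2,1} = \sum_{j}\Big(\sum_i W_{ij}^2\Big)^{1/2}, \qquad \|W\|_{2,\infty} = \max_{j}\Big(\sum_i W_{ij}^2\Big)^{1/2},$$

with the spectral norm $\|W\|_2 = \sigma_{\max}(W)$ being the largest singular value. Since $\|W\|_{2,\infty} \le \|W\|_{2,1} \le d\,\|W\|_{2,\infty}$ for a matrix with $d$ columns, and the right-hand inequality is near-tight when all columns have comparable norms, a $(2,\infty)$-based bound can save a factor on the order of the layer width over a $(2,1)$-based one. This is the sense in which the second-regime bounds can be significantly stronger for a dense last layer.

A minimal NumPy sketch of these quantities, under the (assumed) column-wise convention, contrasting a column-sparse and a dense toy "last layer":

```python
import numpy as np

def mixed_norms(W: np.ndarray) -> dict:
    """Spectral, (2,1), and (2,inf) norms of a weight matrix W.

    The column-wise convention is an assumption here; the paper may
    instead take l2 norms of rows.
    """
    col_norms = np.linalg.norm(W, axis=0)        # l2 norm of each column
    return {
        "spectral": np.linalg.norm(W, 2),        # largest singular value
        "(2,1)":    col_norms.sum(),             # sum of column norms
        "(2,inf)":  col_norms.max(),             # largest column norm
    }

rng = np.random.default_rng(0)
dense = rng.normal(size=(64, 64)) / 8.0
sparse = dense.copy()
sparse[:, 3:] = 0.0                              # only 3 active columns

print("dense: ", mixed_norms(dense))             # (2,1)/(2,inf) near the width, 64
print("sparse:", mixed_norms(sparse))            # (2,1)/(2,inf) near the sparsity, 3
```

For the dense matrix the ratio $(2,1)/(2,\infty)$ is close to the width, so the $(2,\infty)$-based regime is the favorable one; for the column-sparse matrix the ratio collapses to roughly the number of active columns, and the $(2,1)$-based regime can win.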
One-sentence Summary: Different types of guarantees for metric learning with neural networks.
Supplementary Material: zip