Learning Surrogate Losses

Josif Grabocka; Randolf Scholz; Lars Schmidt-Thieme

25 Sept 2019 (modified: 05 May 2023), ICLR 2020 Conference Blind Submission, Readers: Everyone

Keywords: Surrogate losses, Non-differentiable losses

TL;DR: Optimizing Surrogate Loss Functions

Abstract: The minimization of loss functions is the heart and soul of Machine Learning. In this paper, we propose an off-the-shelf optimization approach that can seamlessly minimize virtually any non-differentiable and non-decomposable loss function (e.g. Misclassification Rate, AUC, F1, Jaccard Index, Matthews Correlation Coefficient, etc.). Our strategy learns smooth relaxation versions of the true losses by approximating them through a surrogate neural network. The proposed loss networks are set-wise models which are invariant to the order of mini-batch instances. Ultimately, the surrogate losses are learned jointly with the prediction model via bilevel optimization. Empirical results on multiple datasets with diverse real-life loss functions compared with state-of-the-art baselines demonstrate the efficiency of learning surrogate losses.
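
To make the set-wise idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' architecture: the mean-pooling design, the layer sizes, the function names, and the random untrained weights are all assumptions chosen for illustration, and the bilevel training loop is omitted. It contrasts a true loss that is non-differentiable in the predictions (misclassification rate) with a smooth surrogate that embeds each (label, prediction) pair, averages the embeddings over the mini-batch, and maps the pooled vector to a scalar, so the output does not depend on the order of the instances.

```python
import math
import random


def misclassification_rate(y_true, y_prob, threshold=0.5):
    """True loss: piecewise constant in y_prob because of the hard threshold."""
    wrong = sum(
        1 for y, p in zip(y_true, y_prob) if (p >= threshold) != (y == 1)
    )
    return wrong / len(y_true)


def make_surrogate(hidden=4, seed=0):
    """Build a hypothetical set-wise surrogate loss with random (untrained) weights."""
    rng = random.Random(seed)
    # Per-instance embedding weights: each hidden unit sees (label, prediction).
    w_phi = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(hidden)]
    b_phi = [rng.uniform(-1, 1) for _ in range(hidden)]
    # Readout weights applied to the pooled batch representation.
    w_rho = [rng.uniform(-1, 1) for _ in range(hidden)]

    def surrogate(y_true, y_prob):
        n = len(y_true)
        pooled = []
        for k in range(hidden):
            # Embed every instance, then mean-pool over the mini-batch.
            # math.fsum is correctly rounded, so the sum ignores instance order.
            acts = [
                math.tanh(w_phi[k][0] * y + w_phi[k][1] * p + b_phi[k])
                for y, p in zip(y_true, y_prob)
            ]
            pooled.append(math.fsum(acts) / n)
        z = math.fsum(w * v for w, v in zip(w_rho, pooled))
        # Sigmoid keeps the surrogate in (0, 1), like a rate-style loss.
        return 1.0 / (1.0 + math.exp(-z))

    return surrogate


labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.4, 0.6]
surrogate = make_surrogate()
true_loss = misclassification_rate(labels, probs)
smooth_loss = surrogate(labels, probs)
shuffled_loss = surrogate(labels[::-1], probs[::-1])
```

In this sketch `smooth_loss` and `shuffled_loss` agree because pooling discards instance order. In the setting the abstract describes, the surrogate's weights would be fitted so that its output tracks the true loss while the prediction model is trained against the surrogate; that joint procedure is not shown here.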

Code: https://gofile.io/?c=uJD3QC
