On the Consistency of Spherical Z Loss

Abhishek Sharma

12 Feb 2018 (modified: 05 May 2023), ICLR 2018 Workshop Submission, Readers: Everyone

Abstract: An extremely large and sparse output space in a deep net classifier poses two major challenges: high computational complexity and class ambiguity. Class ambiguity is usually tackled by optimizing top-k error instead of zero-one loss. To deal with computational complexity, recent work by Vincent et al. (2015) and de Brébisson and Vincent (2015) introduced a family of spherical losses that comes with a weight update algorithm whose cost is independent of the output space size. In this family, the Z loss is of particular interest since it outperforms other spherical losses and log-softmax on top-k scores. However, there is no theoretical result on the top-k calibration of the Z loss, nor any concrete connection between top-k scores and the hyper-parameters of the Z loss. This paper provides insights into the relationship between the two and explains how and why the hyper-parameters of the Z loss are essential for optimizing top-k scores.

Keywords: Spherical Z Loss, top-k calibration
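
For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of a Z loss and a top-k check. It is not taken from this submission: it assumes the commonly cited form of the Z loss, in which the network outputs are standardized to z-scores and a softplus with two hyper-parameters is applied to the target class score. The parameter names `a` and `b` and both function names are illustrative assumptions.

```python
import math


def z_loss(outputs, target, a=1.0, b=1.0):
    """Sketch of a Z loss (assumed form): softplus(a * (b - z_target)) / a.

    outputs: list of raw network outputs, one per class.
    target:  index of the correct class.
    a, b:    hyper-parameters (names assumed for this illustration).
    """
    n = len(outputs)
    mean = sum(outputs) / n
    std = math.sqrt(sum((o - mean) ** 2 for o in outputs) / n)
    if std == 0.0:
        raise ValueError("outputs must not all be equal")
    # Standardized score of the target class; shift and scale invariant.
    z_target = (outputs[target] - mean) / std
    x = a * (b - z_target)
    # Numerically stable softplus: log(1 + exp(x)).
    softplus = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return softplus / a


def in_top_k(outputs, target, k):
    """Return True if the target class is among the k highest-scoring classes."""
    ranked = sorted(range(len(outputs)), key=lambda i: outputs[i], reverse=True)
    return target in ranked[:k]
```

Under this assumed form, `b` sets the z-score the target class is pushed towards and `a` controls how sharply the penalty grows below it, which is one way to see why the abstract ties these hyper-parameters to top-k scores.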
