Softmax Supervision with Isotropic Normalization


Nov 03, 2017 (modified: Nov 03, 2017) ICLR 2018 Conference Blind Submission readers: everyone Show Bibtex
  • Abstract: The softmax function is widely used to train deep neural networks for multi-class classification. Despite its outstanding performance in classification tasks, the features derived from the supervision of softmax are usually sub-optimal in some scenarios where Euclidean distances apply in feature spaces. To address this issue, we propose a new loss, dubbed the isotropic loss, in the sense that the overall distribution of data points is regularized to approach the isotropic normal one. Combined with the vanilla softmax, we formalize a novel criterion called the isotropic softmax, or isomax for short, for supervised learning of deep neural networks. By virtue of the isomax, the intra-class features are penalized by the isotropic loss while inter-class distances are well kept by the original softmax loss. Moreover, the isomax loss does not require any additional modifications to the network, mini-batches or the training process. Extensive experiments on classification and clustering are performed to demonstrate the superiority and robustness of the isomax loss.
  • TL;DR: The discriminative capability of softmax for learning feature vectors of objects is effectively enhanced by virture of isotropic normalization on global distribution of data points.
  • Keywords: softmax, center loss, triplet loss, convolution neural network, supervised learning