Multi-Class Classification with Abstention Based on Crammer–Singer Surrogate with Linear Growth Rate
Keywords: learning with abstention, multi-class classification, learning theory
TL;DR: We propose a novel family of convex surrogate losses, based on the Crammer–Singer (CS) surrogate, for multi-class classification with abstention in the predictor-rejector framework; they enjoy better optimization properties than previously known losses.
Abstract: We study the problem of multi-class classification with abstention, where a learner can choose to abstain from predicting in order to avoid excessively uncertain predictions. In this problem, the predictor-rejector framework is a promising approach, in which a predictor and a rejector are learned separately and the abstention cost is explicitly taken into account. However, only non-convex surrogate losses have been known for this framework, owing to the inherent difficulty of the multi-class setting. To overcome this difficulty, we propose a novel family of surrogate losses for multi-class classification with abstention based on the Crammer–Singer (CS) surrogate, which can yield a convex loss that is easy to optimize. We show that the proposed surrogate losses lead to the optimal predictor and rejector, and we prove excess error bounds for our surrogate losses, demonstrating a linear growth rate for a certain choice of losses.
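For reference, a minimal sketch of the standard Crammer–Singer surrogate on which the proposed losses build, namely max(0, 1 + max_{y'≠y} f_{y'}(x) − f_y(x)); the abstention-aware extension and the rejector are the paper's contribution and are not reproduced here, and the function name and score representation are illustrative:

```python
def cs_surrogate(scores, y):
    """Standard Crammer-Singer multi-class hinge surrogate.

    scores: list of real-valued class scores f_1(x), ..., f_K(x).
    y: index of the true class.
    Returns max(0, 1 + max_{y' != y} f_{y'}(x) - f_y(x)).
    """
    # Highest score among the competing (incorrect) classes.
    runner_up = max(s for j, s in enumerate(scores) if j != y)
    # Hinge on the margin between the true class and its strongest rival.
    return max(0.0, 1.0 + runner_up - scores[y])
```

For example, `cs_surrogate([2.0, 0.5, 0.1], 0)` is 0.0 because the true class wins by more than the unit margin, while `cs_surrogate([0.5, 2.0, 0.1], 0)` is 2.5 because class 1 outscores the true class. The surrogate is convex in the scores, which is the property the paper's abstention-aware family is designed to preserve.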
Submission Number: 29