TL;DR: We prove the first set of SQ lower bounds for multiclass linear classification under random classification noise.
Abstract: We study the task of Multiclass Linear Classification (MLC)
in the distribution-free PAC model
with Random Classification Noise (RCN).
Specifically, the learner is given a set of
labeled examples $(x, y)$, where $x$ is drawn
from an unknown distribution on $\mathbb{R}^d$
and the labels are generated by a
multiclass linear classifier corrupted with RCN.
That is, the label $y$ is flipped from $i$ to $j$
with probability $H_{ij}$
according to a known noise matrix $H$ with
non-negative separation
$\sigma := \min_{i \neq j} (H_{ii}-H_{ij})$.
The goal is to compute a hypothesis with
small 0-1 error. For the special case of two labels,
prior work has given polynomial-time algorithms
achieving the optimal error.
Surprisingly, little is known about
the complexity of this task even for three labels.
As our main contribution, we show that the complexity
of MLC with RCN becomes drastically different
in the presence of three or more labels.
Specifically, we prove super-polynomial
Statistical Query (SQ) lower bounds for this problem.
In more detail, even for three labels and
constant separation,
we give a super-polynomial lower bound
on the complexity of any SQ algorithm achieving optimal error.
For a larger number of labels and smaller separation,
we show a super-polynomial SQ lower bound even
for the weaker goal of achieving any constant-factor approximation to the optimal loss or even beating the trivial hypothesis.
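
As an illustration of the noise model described in the abstract, the following is a minimal sketch (not code from the paper) that draws labels from a multiclass linear classifier, corrupts them with a noise matrix $H$, and computes the separation $\sigma$. It assumes the classifier predicts $\arg\max_i \langle w_i, x \rangle$, that row $i$ of $H$ is the distribution of the observed label when the clean label is $i$, and that $x$ is Gaussian purely for demonstration (the paper's setting is distribution-free). The names `separation`, `sample_rcn`, `W`, and the specific entries of `H` are hypothetical.

```python
import numpy as np

def separation(H):
    """sigma = min over i != j of (H[i, i] - H[i, j])."""
    H = np.asarray(H, dtype=float)
    k = H.shape[0]
    off_diag = ~np.eye(k, dtype=bool)       # mask selecting entries with i != j
    diffs = np.diag(H)[:, None] - H         # diffs[i, j] = H[i, i] - H[i, j]
    return diffs[off_diag].min()

def sample_rcn(W, H, X, rng):
    """Return (clean, noisy) labels for the rows of X.

    Assumption: the clean label is argmax_i <W[i], x>, and the observed
    label is drawn from row H[clean] of the noise matrix.
    """
    clean = np.argmax(X @ W.T, axis=1)
    k = H.shape[0]
    noisy = np.array([rng.choice(k, p=H[i]) for i in clean])
    return clean, noisy

rng = np.random.default_rng(0)
d, k, n = 5, 3, 1000
W = rng.standard_normal((k, d))             # hypothetical weight vectors, one per label
X = rng.standard_normal((n, d))             # illustrative choice of distribution on R^d
H = np.array([[0.60, 0.20, 0.20],
              [0.10, 0.70, 0.20],
              [0.25, 0.25, 0.50]])          # hypothetical noise matrix, rows sum to 1

sigma = separation(H)
clean, noisy = sample_rcn(W, H, X, rng)
flip_rate = float(np.mean(clean != noisy))  # empirical fraction of corrupted labels
```

In this example the smallest gap between a diagonal entry and an off-diagonal entry in the same row comes from the third row, so `sigma` is 0.25; a positive separation means each clean label remains the single most likely observed label.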
Lay Summary: We study the computational complexity of multiclass linear classification with random classification noise. We provide the first set of statistical query lower bounds for the problem, indicating that, unlike binary linear classification, which can be learned efficiently, multiclass linear classification with random classification noise is hard in the distribution-free setting.
Primary Area: Theory->Learning Theory
Keywords: Multiclass Linear Classification, Random Classification Noise, Statistical Query Learning
Submission Number: 13262