Keywords: learning from label proportions, linear thresholds, algorithm, hardness
TL;DR: This work provides algorithmic and hardness results for properly learning linear thresholds from label proportions.
Abstract: We study the problem of properly learning linear threshold functions (LTFs) in the learning from label proportions (LLP) framework. In this framework, the learner is given a collection of bags of feature-vectors, with only the proportion of labels available for each bag. First, we provide an algorithm that, given a collection of such bags, each of size at most two, whose label proportions are consistent with (i.e., the bags are satisfied by) an unknown LTF, efficiently produces an LTF that satisfies at least a $(2/5)$-fraction of the bags. If all the bags are non-monochromatic (i.e., bags of size two with differently labeled feature-vectors), the algorithm satisfies at least a $(1/2)$-fraction of them. For the special case of OR over the $d$-dimensional boolean vectors, we give an algorithm which computes an LTF achieving an additional $\Omega(1/d)$ in accuracy in each of the two cases. Our main result provides evidence that these algorithmic bounds cannot be significantly improved, even for learning monotone ORs using LTFs. We prove that it is NP-hard, given a collection of non-monochromatic bags which are all satisfied by some monotone OR, to compute any function of constantly many LTFs that satisfies a $(1/2 + \varepsilon)$-fraction of the bags for any constant $\varepsilon > 0$. This bound is tight for the case of non-monochromatic bags. The above is in contrast to the usual supervised learning setup (i.e., unit-sized bags), in which LTFs are efficiently learnable to arbitrary accuracy using linear programming, and even a trivial algorithm (any LTF or its complement) achieves an accuracy of $1/2$. These techniques, however, fail in the LLP setting. Indeed, we show that the complexity of LLP learning of LTFs using LTFs (even for the special case of monotone ORs) increases dramatically as soon as bags of size two are allowed. Our work gives the first inapproximability result for LLP learning of LTFs, and a strong complexity separation between LLP and traditional supervised learning.
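To make the notion of a bag being "satisfied" concrete, here is a minimal Python sketch. It is an illustration only and not the algorithm from the paper: the helper names (`ltf`, `satisfies`, `satisfied_fraction`), the convention that an LTF outputs 1 when $\langle w, x \rangle + b > 0$, and the toy bags are all assumptions made for this example. It also shows, on this toy instance, why the "LTF or its complement" trick gives nothing extra on non-monochromatic bags: negating every label leaves the set of satisfied size-two bags unchanged.

```python
from fractions import Fraction

def ltf(w, b):
    """Hypothetical helper: return x -> 1 if <w, x> + b > 0 else 0."""
    def f(x):
        return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
    return f

def satisfies(f, bag, proportion):
    """A bag is satisfied when the fraction of its points labeled 1 by f
    equals the bag's given label proportion."""
    return Fraction(sum(f(x) for x in bag), len(bag)) == proportion

def satisfied_fraction(f, bags):
    """Fraction of bags satisfied by the labeling function f."""
    return Fraction(sum(satisfies(f, bag, p) for bag, p in bags), len(bags))

# Toy target (an assumption for illustration): the monotone OR x1 OR x2
# over d = 3 boolean coordinates, written as an LTF.
target = ltf([1, 1, 0], -0.5)

# Non-monochromatic bags of size two: each has label proportion 1/2
# under the target, i.e. exactly one point per bag is labeled 1.
half = Fraction(1, 2)
bags = [
    ([(0, 0, 1), (1, 0, 0)], half),
    ([(0, 0, 0), (0, 1, 1)], half),
    ([(1, 1, 0), (0, 0, 1)], half),
    ([(0, 0, 0), (0, 1, 0)], half),
]

# A candidate hypothesis that only looks at x1, and its pointwise complement.
guess = ltf([1, 0, 0], -0.5)
complement = lambda x: 1 - guess(x)

print(satisfied_fraction(target, bags))      # 1
print(satisfied_fraction(guess, bags))       # 1/2
print(satisfied_fraction(complement, bags))  # 1/2
```

On a non-monochromatic bag of size two, a hypothesis satisfies the bag exactly when it labels the two points differently, and that property is preserved under complementation. So, unlike with unit-sized bags, switching to the complement cannot turn an unsatisfied bag into a satisfied one.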
Supplementary Material: pdf
Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.