Abstract: We study the learnability of sets in Ropf <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">n</sup> under the Gaussian distribution, taking Gaussian surface area as the "complexity measure" of the sets being learned. Let C <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">S</sub> denote the class of all (measurable) sets with surface area at most S. We first show that the class C <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">S</sub> is learnable to any constant accuracy in time n <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">O(S</sup> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">)</sup> , even in the arbitrary noise ("agnostic'') model. Complementing this, we also show that any learning algorithm for C <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">S</sub> information-theoretically requires 2 <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Omega(S</sup> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">)</sup> examples for learning to constant accuracy. These results together show that Gaussian surface area essentially characterizes the computational complexity of learning under the Gaussian distribution. Our approach yields several new learning results, including the following (all bounds are for learning to any constant accuracy): The class of all convex sets can be agnostically learned in time 2 <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">O</sup> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">~</sup> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">(radicn)</sup> (and we prove a 2 <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">Omega(radicn)</sup> lower bound for noise-free learning). This is the first subexponential time algorithm for learning general convex sets even in the noise-free (PAC) model. Intersections of k halfspaces can be agnostically learned in time n <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">O(log</sup> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">k)</sup> (cf. Vempala's n <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">O(k)</sup> time algorithm for learning in the noise-free model).Cones (with apex centered at the origin), and spheres witharbitrary radius and center, can be agnostically learned in time poly(n).
0 Replies
Loading