Abstract: In the subspace approximation problem, we seek a k-dimensional subspace F of R<sup>d</sup> that minimizes the sum of p-th powers of Euclidean distances to a given set of n points a<sub>1</sub>, …, a<sub>n</sub> ∈ R<sup>d</sup>, for p ≥ 1. More generally than minimizing Σ<sub>i</sub> dist(a<sub>i</sub>, F)<sup>p</sup>, we may wish to minimize Σ<sub>i</sub> M(dist(a<sub>i</sub>, F)) for some loss function M(·), for example, M-Estimators, which include the Huber and Tukey loss functions. Such subspaces provide alternatives to the singular value decomposition (SVD), which is the p = 2 case, finding such an F that minimizes the sum of squares of distances. For p ∈ [1, 2), and for typical M-Estimators, the minimizing F gives a solution that is more robust to outliers than that provided by the SVD. We give several algorithmic results for these robust subspace approximation problems. We state our results as follows, thinking of the n points as forming an n × d matrix A, and letting nnz(A) denote the number of non-zero entries of A. Our results hold for p ∈ [1, 2).
We use poly(n) to denote n<sup>O(1)</sup> as n → ∞.
1) For minimizing Σ<sub>i</sub> dist(a<sub>i</sub>, F)<sup>p</sup>, we give an algorithm running in O(nnz(A) + (n + d)poly(k/ε) + exp(poly(k/ε))) time.
2) We show that the problem of minimizing Σ<sub>i</sub> dist(a<sub>i</sub>, F)<sup>p</sup> is NP-hard, even to output a (1 + 1/poly(d))-approximation. This extends work of Deshpande et al. (SODA, 2011), which showed NP-hardness or UGC-hardness only for p > 2; their proofs critically rely on p > 2. Our work resolves an open question of [Kannan and Vempala, NOW, 2009]. Thus, there cannot be an algorithm running in time polynomial in k and 1/ε unless P = NP. Together with prior work, this implies that the problem is NP-hard for all p ≠ 2.
3) For loss functions for a wide class of M-Estimators, we give a problem-size reduction: for a parameter K = (log n)<sup>O(log k)</sup>, our reduction takes O(nnz(A) log n + (n + d)poly(K/ε)) time to reduce the problem to a constrained version involving matrices whose dimensions are poly(Kε<sup>-1</sup> log n). We also give bicriteria solutions.
4) Our techniques lead to the first O(nnz(A) + poly(d/ε)) time algorithms for (1 + ε)-approximate regression for a wide class of convex M-Estimators. This improves prior results [1], which were (1 + ε)-approximation for Huber regression only, and O(1)-approximation for a general class of M-Estimators.
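To make the objectives in the abstract concrete, the following is a minimal sketch that evaluates Σ_i dist(a_i, F)^p and Σ_i M(dist(a_i, F)) for a fixed candidate subspace F. It is not the paper's algorithm: it only scores a given F and does not search for the minimizer. The function names, the toy data, and the Huber threshold `tau` are assumptions introduced here for illustration, and the subspace is assumed to be given by an orthonormal basis.

```python
import math

def dist_to_subspace(a, basis):
    """Euclidean distance from point a to span(basis).

    Assumes `basis` is a list of orthonormal vectors (an assumption of this
    sketch), so dist^2 = ||a||^2 - sum_j <a, u_j>^2.
    """
    sq_norm = sum(x * x for x in a)
    sq_proj = sum(sum(x * u for x, u in zip(a, vec)) ** 2 for vec in basis)
    return math.sqrt(max(sq_norm - sq_proj, 0.0))  # clamp tiny negatives

def p_loss(p):
    """Loss t -> t^p, the p-th power objective from the abstract."""
    return lambda t: t ** p

def huber(t, tau=1.0):
    """Huber loss: quadratic for |t| <= tau, linear beyond (tau is assumed)."""
    t = abs(t)
    return t * t / (2.0 * tau) if t <= tau else t - tau / 2.0

def subspace_cost(points, basis, loss):
    """Sum of loss(dist(a_i, F)) over all points, for F = span(basis)."""
    return sum(loss(dist_to_subspace(a, basis)) for a in points)

# Hypothetical toy data: five points on the x-axis and one outlier on the y-axis.
points = [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (-2.0, 0.0), (-3.0, 0.0), (0.0, 6.0)]
x_axis = [(1.0, 0.0)]  # candidate 1-dimensional subspace through the inliers
y_axis = [(0.0, 1.0)]  # candidate 1-dimensional subspace through the outlier

for name, loss in [("p=2", p_loss(2)), ("p=1", p_loss(1)), ("huber", huber)]:
    print(name, subspace_cost(points, x_axis, loss), subspace_cost(points, y_axis, loss))
```

On this toy data the p = 2 cost is 36 for the x-axis against 27 for the y-axis, so the squared objective favors the line through the single outlier, whereas p = 1 gives 6 against 11 and the Huber loss gives 5.5 against 8.5, both favoring the line through the inliers. This compares only two candidate lines and is meant as an illustration of the robustness claim, not as evidence for it.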