Abstract: Regression in the presence of outliers is an inherently combinatorial problem. However, compressive sensing theory suggests that certain combinatorial optimization problems can be exactly solved using polynomial-time algorithms. Motivated by this connection, several research groups have proposed polynomial-time algorithms for robust regression. In this paper we specifically address the traditional robust regression problem, where the number of observations is more than the number of unknown regression parameters and the structure of the regressor matrix is defined by the training dataset (and hence it may not satisfy properties such as Restricted Isometry Property or incoherence). We derive the precise conditions under which the sparse regularization ( <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">0</sub> and <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sub> -norm) approaches solve the robust regression problem. We show that the smallest principal angle between the regressor subspace and all <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">k</i> -dimensional outlier subspaces is the fundamental quantity that determines the performance of these algorithms. In terms of this angle we provide an estimate of the number of outliers the sparse regularization based approaches can handle. We then empirically evaluate the sparse ( <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sub> -norm) regularization approach against other traditional robust regression algorithms to identify accurate and efficient algorithms for high-dimensional regression problems.
0 Replies
Loading