Abstract: For high-dimensional data, it is often desirable to group similar features together during the learning process. This can reduce the estimation variance and improve the stability of feature selection, leading to better generalization. Moreover, it can also help in understanding and interpreting data. Octagonal shrinkage and clustering algorithm for regression (OSCAR) is a recent sparse-modeling approach that uses a <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">1</sub> -regularizer and a pairwise <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">l</i> <sub xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">∞</sub> -regularizer on the feature coefficients to encourage such feature grouping. However, computationally, its optimization procedure is very expensive. In this paper, we propose an efficient solver based on the accelerated gradient method. We show that its key proximal step can be solved by a highly efficient simple iterative group merging algorithm. Given <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">d</i> input features, this reduces the empirical time complexity from <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">O</i> ( <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">d</i> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">2</sup> ~ <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">d</i> <sup xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">5</sup> ) for the existing solvers to just <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">O</i> ( <i xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">d</i> ). Experimental results on a number of toy and real-world datasets demonstrate that OSCAR is a competitive sparse-modeling approach, but with the added ability of automatic feature grouping.
0 Replies
Loading