Abstract: The Nyström method and low-rank linearized Support Vector Machines (SVMs) are two widely used methods for scaling up kernel SVMs, both of which need to sample a subset of the columns of the kernel matrix to reduce its size. However, existing non-uniform sampling methods suffer from at least quadratic time complexity in the number of training points, limiting the scalability of kernel SVMs. In this paper, we propose a parallel sampling method called parallel column subset selection (PCSS) based on the divide-and-conquer strategy, which divides the kernel matrix into several small submatrices and then selects columns in parallel. We prove that PCSS has a $(1+\epsilon)$ relative-error upper bound with respect to the kernel matrix. Further, we present two approaches to scaling up kernel SVMs by combining PCSS with the Nyström method and low-rank linearized SVMs. The results of comparison experiments demonstrate the effectiveness, efficiency and scalability of our approaches.
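To make the divide-and-conquer idea concrete, the following is a minimal sketch of a PCSS-style pipeline: the data are split into blocks, a few landmark columns are scored and selected within each block (in parallel in the actual method; sequentially here for simplicity), and the pooled columns feed a Nyström approximation of the kernel matrix. The abstract does not specify the per-block selection rule, so the column-norm score, the RBF kernel, and all parameter values below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    # Gaussian (RBF) kernel matrix between row sets X and Y.
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

def pcss_select(X, k, n_blocks=4, gamma=0.5, seed=0):
    """Divide the rows into blocks, score columns inside each block,
    and pool the top picks. The score used here (column norms of the
    block kernel) is a placeholder for the paper's selection rule."""
    rng = np.random.default_rng(seed)
    blocks = np.array_split(rng.permutation(X.shape[0]), n_blocks)
    per_block = max(1, k // n_blocks)
    chosen = []
    for idx in blocks:  # each iteration is independent -> parallelizable
        Kb = rbf_kernel(X[idx], X[idx], gamma)
        scores = np.linalg.norm(Kb, axis=0)
        chosen.extend(idx[np.argsort(scores)[::-1][:per_block]].tolist())
    return np.array(chosen[:k])

def nystrom_approx(X, cols, gamma=0.5):
    """Rank-|cols| Nyström approximation K ~ C W^+ C^T."""
    C = rbf_kernel(X, X[cols], gamma)
    W = rbf_kernel(X[cols], X[cols], gamma)
    return C @ np.linalg.pinv(W) @ C.T

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
cols = pcss_select(X, k=40)
K = rbf_kernel(X, X)
K_hat = nystrom_approx(X, cols)
rel_err = np.linalg.norm(K - K_hat, "fro") / np.linalg.norm(K, "fro")
```

Because each block is processed independently, the per-block selection can run on separate workers; only the small kernel submatrix of each block is ever materialized, which is what avoids the quadratic cost of scoring the full kernel matrix.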