Fast Regression for Structured InputsDownload PDF


Sep 29, 2021 (edited Nov 18, 2021)ICLR 2022 Conference Blind SubmissionReaders: Everyone
  • Keywords: regression, sublinear time algorithm, structured input
  • Abstract: $L_p$ Regression on Structured Inputs is an important problem in data analysis and machine learning where we find a vector \(\mathbf{x}\in\mathbb R^{d}\) that minimizes \(\|\mathbf{A}\mathbf{x}-\mathbf{b}\|_p\) for a \textit{structured} matrix \(\mathbf{A}\in\mathbb R^{n \times d}\) and response vector \(\mathbf{b}\in\mathbb R^{n}\). Unfortunately, for many common classes of matrices, sampling-based algorithms for approximately solving $L_p$ regression require runtime that is exponential in $p$, e.g., $d^{\mathcal{O}(p)}$, which is prohibitively expensive. We show that for a large class of structured inputs, such as combinations of low-rank matrices, sparse matrices, and Vandermonde matrices, $L_p$ regression can be approximately solved using runtime that is polynomial in $p$. For example, we show that $L_p$ regression on Vandermonde matrices can be approximately solved using time $\mathcal{O}(T(\mathbf{A})\log n+(dp)^\omega\cdot\text{polylog}\,n)$, where $T(\mathbf{A})$ is the time to multiply $\mathbf{A}\mathbf{x}$ for an arbitrary vector $\mathbf{x}\in\mathbb{R}^d$, and $\omega$ is the exponent of matrix multiplication. The polynomial dependence on $p$ also crucially allows our algorithms to extend naturally to sublinear time algorithms for $L_\infty$ regression. Of independent interest, we develop a new algorithm for solving $L_p$ regression for arbitrary matrices, which is significantly faster in practice for every $p\ge4$.
  • Supplementary Material: zip
3 Replies