## Fast Regression for Structured Inputs

Sep 29, 2021 (edited Nov 18, 2021) · ICLR 2022 Conference Blind Submission · Readers: Everyone
• Keywords: regression, sublinear time algorithm, structured input
• Abstract: $L_p$ regression on structured inputs is an important problem in data analysis and machine learning, in which one seeks a vector $\mathbf{x}\in\mathbb{R}^{d}$ that minimizes $\|\mathbf{A}\mathbf{x}-\mathbf{b}\|_p$ for a *structured* matrix $\mathbf{A}\in\mathbb{R}^{n \times d}$ and response vector $\mathbf{b}\in\mathbb{R}^{n}$. Unfortunately, for many common classes of matrices, sampling-based algorithms for approximately solving $L_p$ regression require runtime that is exponential in $p$, e.g., $d^{\mathcal{O}(p)}$, which is prohibitively expensive. We show that for a large class of structured inputs, such as combinations of low-rank matrices, sparse matrices, and Vandermonde matrices, $L_p$ regression can be approximately solved in runtime that is polynomial in $p$. For example, we show that $L_p$ regression on Vandermonde matrices can be approximately solved in time $\mathcal{O}(T(\mathbf{A})\log n+(dp)^\omega\cdot\text{polylog}\,n)$, where $T(\mathbf{A})$ is the time to compute $\mathbf{A}\mathbf{x}$ for an arbitrary vector $\mathbf{x}\in\mathbb{R}^d$, and $\omega$ is the exponent of matrix multiplication. The polynomial dependence on $p$ also crucially allows our algorithms to extend naturally to sublinear-time algorithms for $L_\infty$ regression. Of independent interest, we develop a new algorithm for solving $L_p$ regression for arbitrary matrices, which is significantly faster in practice for every $p\ge 4$.
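To make the problem setup concrete, here is a minimal sketch of $L_p$ regression on a Vandermonde matrix using iteratively reweighted least squares (IRLS). This is a generic textbook baseline for illustration only, not the sampling-based algorithm proposed in the submission; the function name and parameters are our own.

```python
import numpy as np

def lp_regression_irls(A, b, p=4, iters=100, eps=1e-10):
    """Approximate argmin_x ||Ax - b||_p via iteratively reweighted
    least squares (IRLS). Generic baseline, not the paper's method."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]  # L2 warm start
    alpha = 1.0 / (p - 1)                     # damped step for p > 2
    for _ in range(iters):
        r = A @ x - b
        w = np.maximum(np.abs(r), eps) ** (p - 2)  # IRLS weights |r|^(p-2)
        AtWA = A.T @ (A * w[:, None])
        AtWb = A.T @ (w * b)
        x_new = np.linalg.solve(AtWA, AtWb)
        x = x + alpha * (x_new - x)
    return x

# Structured input: an n x d Vandermonde matrix built from n nodes.
rng = np.random.default_rng(0)
t = rng.uniform(-1, 1, size=200)
A = np.vander(t, N=6, increasing=True)        # 200 x 6 Vandermonde
x_true = rng.standard_normal(6)
b = A @ x_true + 0.01 * rng.standard_normal(200)

x_hat = lp_regression_irls(A, b, p=4)
print(np.linalg.norm(A @ x_hat - b, ord=4))   # L4 residual norm
```

Note that this dense solver costs at least $\mathcal{O}(nd^2)$ per iteration; the point of the submission is to beat such baselines by exploiting the structure of $\mathbf{A}$ (here, that $T(\mathbf{A})$ is small for Vandermonde matrices).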
• Supplementary Material: zip