The Universal Approximation Power of Finite-Width Deep ReLU NetworksDownload PDF

27 Sep 2018 (modified: 21 Dec 2018)ICLR 2019 Conference Blind SubmissionReaders: Everyone
  • Abstract: We show that finite-width deep ReLU neural networks yield rate-distortion optimal approximation (Bölcskei et al., 2018) of a wide class of functions, including polynomials, windowed sinusoidal functions, one-dimensional oscillatory textures, and the Weierstrass function, a fractal function which is continuous but nowhere differentiable. Together with the recently established universal approximation result for affine function systems (Bölcskei et al., 2018), this demonstrates that deep neural networks approximate vastly different signal structures generated by the affine group, the Weyl-Heisenberg group, or through warping, and even certain fractals, all with approximation error decaying exponentially in the number of neurons. We also prove that in the approximation of sufficiently smooth functions finite-width deep networks require strictly fewer neurons than finite-depth wide networks.
  • Keywords: rate-distortion optimality, ReLU, deep learning, approximation theory, Weierstrass function
9 Replies