Minimum Width for Universal ApproximationDownload PDF

Sep 28, 2020 (edited Mar 17, 2021)ICLR 2021 SpotlightReaders: Everyone
  • Keywords: universal approximation, neural networks
  • Abstract: The universal approximation property of width-bounded networks has been studied as a dual of classical universal approximation results on depth-bounded networks. However, the critical width enabling the universal approximation has not been exactly characterized in terms of the input dimension $d_x$ and the output dimension $d_y$. In this work, we provide the first definitive result in this direction for networks using the ReLU activation functions: The minimum width required for the universal approximation of the $L^p$ functions is exactly $\max\{d_x+1,d_y\}$. We also prove that the same conclusion does not hold for the uniform approximation with ReLU, but does hold with an additional threshold activation function. Our proof technique can be also used to derive a tighter upper bound on the minimum width required for the universal approximation using networks with general activation functions.
  • Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
  • One-sentence Summary: We establish the tight bound on width for the universal approximability of neural network.
10 Replies