Characterizing ResNet's Universal Approximation Capability

22 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: learning theory
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: ResNet, approximation complexity
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Since its debut in 2016, ResNet has become arguably the most favored architecture in deep neural network (DNN) design. It effectively addresses the gradient vanishing issue in DNN training, allowing engineers to fully unleash DNN's potential in tackling challenging problems in various domains. Despite its practical success, an essential theoretical question remains largely open: how well can ResNet approximate functions? In this paper, we show that ResNet with bottleneck blocks (b-ResNet) can approximate any $d$-dimensional monomial of degree $p$ to any accuracy $\epsilon>0$ with $\mathcal{O}(dp\log (p/\epsilon))$ weights, and we extend the results to polynomials, smooth functions, and continuous functions. This is a factor-of-$d$ reduction in the number of trainable weights compared with the classical results for ReLU feedforward networks. Our results reveal that a continuous-depth network generated via a dynamical system possesses significant approximation capabilities even if its dynamics function is realized by a shallow ReLU network with an absolute constant number of neurons. Furthermore, our achievability result is order-optimal in terms of $\epsilon$, as it matches the generalized lower bound. In addition, we show that ResNet can approximate a special function class based on the Kolmogorov Superposition Theorem with $\mathcal{O}(d^4\epsilon^{-1})$ tunable weights, thereby overcoming the curse of dimensionality. This work adds to the theoretical justifications for ResNet's stellar practical performance.
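To make the architecture in the abstract concrete, the following is a minimal NumPy sketch of the kind of residual bottleneck block the result concerns: an identity skip connection plus a shallow ReLU network with a narrow hidden layer. The function names, widths, and random weights here are illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

def relu(x):
    # Elementwise ReLU activation.
    return np.maximum(x, 0.0)

def bottleneck_block(x, W_in, W_out):
    """One residual bottleneck block (illustrative, not the paper's
    construction): identity skip plus a shallow ReLU network whose
    hidden layer is narrow (the 'bottleneck')."""
    return x + W_out @ relu(W_in @ x)

# Stacking blocks deepens the network (hypothetical random weights):
rng = np.random.default_rng(0)
d, m = 4, 2                      # input width d, bottleneck width m << d
x = rng.standard_normal(d)
for _ in range(3):
    W_in = 0.1 * rng.standard_normal((m, d))
    W_out = 0.1 * rng.standard_normal((d, m))
    x = bottleneck_block(x, W_in, W_out)

# With all-zero weights, the block reduces to the identity map:
assert np.allclose(
    bottleneck_block(np.ones(d), np.zeros((m, d)), np.zeros((d, m))),
    np.ones(d),
)
```

The skip connection is why zero-initialized blocks act as the identity, which is the intuition behind ResNet's trainability at depth mentioned in the abstract.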
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: pdf
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 4958