Rethink Depth Separation with Intra-layer Links

20 Sept 2023 (modified: 11 Feb 2024) · Submitted to ICLR 2024
Primary Area: learning theory
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Deep learning theory, depth separation, width, intra-layer links
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Depth separation theory indicates that depth is significantly more powerful than width. Such a result typically consists of two parts: i) there exists a function representable by a deep network; ii) this function cannot be represented by a shallow network unless its width exceeds a large threshold. However, the depth-width comparison therein is always based on standard fully-connected networks, which motivates us to ask: Is width always significantly weaker than depth? Here, we report through bound estimation, explicit construction, and functional space analysis that adding shortcuts to connect neurons within a layer can greatly empower width, such that a slender and shallow network can represent a deep network. Specifically, compared with the threshold prescribed earlier, the width needed to represent the renowned “sawtooth” functions can be \textit{exponentially} reduced by intra-layer links. This means that width can also be powerful when armed with intra-layer links. Because the sawtooth function is a fundamental module in approximating polynomials and smooth functions, our width saving generalizes to broader classes of functions. Lastly, the mechanism we identify can be translated into analyzing the expressivity of popular shortcut networks such as ResNet and DenseNet. We demonstrate that adding intra-layer links also empowers a ResNet to generate more linear pieces.
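For readers unfamiliar with the "sawtooth" functions referenced in the abstract, the sketch below illustrates the standard triangle-map composition (the classic depth-separation construction): composing a simple tent map k times yields a piecewise-linear function with 2^k pieces, which a depth-O(k), constant-width ReLU network represents exactly, while known lower bounds force a shallow fully-connected network to be very wide. This is background only, not the paper's intra-layer-link construction; the function names (`hat`, `sawtooth`, `count_linear_pieces`) and the grid-based piece-counting heuristic are illustrative assumptions.

```python
import numpy as np

def hat(x):
    """Tent map on [0, 1]; realizable by a tiny ReLU block,
    e.g. hat(x) = 2*relu(x) - 4*relu(x - 0.5)."""
    return 2.0 * np.minimum(x, 1.0 - x)

def sawtooth(x, k):
    """k-fold composition of the tent map: a depth-O(k), constant-width
    ReLU network represents it exactly, and it has 2**k linear pieces."""
    for _ in range(k):
        x = hat(x)
    return x

def count_linear_pieces(f, k, refine=3):
    """Count linear pieces of f on [0, 1] on a dyadic grid fine enough that
    the breakpoints j / 2**k fall exactly on grid points."""
    n = 2 ** (k + refine)
    xs = np.linspace(0.0, 1.0, n + 1)
    slopes = np.diff(f(xs)) * n            # slope on each grid cell
    return int(np.sum(np.abs(np.diff(slopes)) > 1e-6) + 1)

if __name__ == "__main__":
    for k in (2, 4, 6, 8):
        pieces = count_linear_pieces(lambda x: sawtooth(x, k), k)
        print(f"k={k}: {pieces} linear pieces (expected {2**k})")
```

The exponential growth in linear pieces with depth is what makes the sawtooth family the canonical witness for depth separation; the paper's claim is that intra-layer links let a shallow, narrow network match this expressive power with exponentially less width than the fully-connected lower bound requires.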
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 2609