Keywords: expressive power, depth, exact representations, ReLU networks, mixed volumes, lattice polytopes, number theory
TL;DR: $\max \{0,x_1,\ldots,x_n\}$ requires depth $\Omega(\log n)$ to be represented exactly by ReLU networks whose weights are decimal fractions.
Abstract: To confirm that the expressive power of ReLU neural networks grows with their depth, the function $F_n = \max (0,x_1,\ldots,x_n )$ has been considered in the literature.
A conjecture by Hertrich, Basu, Di Summa, and Skutella [NeurIPS 2021] states that any ReLU network that exactly represents $F_n$ has at least $\lceil \log_2 (n+1) \rceil$ hidden layers.
The conjecture has recently been confirmed for networks with integer weights by Haase, Hertrich, and Loho [ICLR 2023].
We follow up on this line of research and show that, when all weights are decimal fractions, any ReLU network exactly representing $F_n$ has at least $\lceil \log_3 (n+1) \rceil$ hidden layers. Moreover, if all weights are $N$-ary fractions, then $\Omega( \frac{\ln n}{\ln \ln N})$ hidden layers are required.
These results are a partial confirmation of the above conjecture for rational ReLU networks, and provide the first non-constant lower bound on the depth of practically relevant ReLU networks.
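To make the depth measure concrete, the following NumPy sketch illustrates the standard tournament-of-maxima construction, which realizes $F_n$ exactly using $\lceil \log_2(n+1) \rceil$ rounds of pairwise ReLU-based maxima (matching the conjectured lower bound from above); the helper names `relu`, `max_via_relu`, and `F` are illustrative and not taken from the submission.
```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def max_via_relu(a, b):
    # max(a, b) = a + relu(b - a); the identity a is carried through the
    # ReLU layer as relu(a) - relu(-a), so each pairwise maximum costs one
    # hidden layer of ReLU units.
    return (relu(a) - relu(-a)) + relu(b - a)

def F(x):
    # F_n(x) = max(0, x_1, ..., x_n), computed by a balanced tournament of
    # pairwise maxima; including the extra 0 input, this takes
    # ceil(log2(n+1)) rounds, i.e. ceil(log2(n+1)) hidden layers.
    vals = [0.0] + list(x)
    while len(vals) > 1:
        nxt = [max_via_relu(vals[i], vals[i + 1])
               for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2 == 1:
            nxt.append(vals[-1])  # odd element is carried to the next round
        vals = nxt
    return vals[0]

x = np.random.randn(7)
assert np.isclose(F(x), max(0.0, *x))
```
The conjecture asserts that this construction is depth-optimal: no ReLU network with fewer than $\lceil \log_2(n+1) \rceil$ hidden layers represents $F_n$ exactly.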
Supplementary Material: pdf
Primary Area: learning theory
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 10347