Keywords: Neural Networks, Piecewise Linear Functions, Exact Representations, Polyhedral Geometry, Braid Fan, Boolean Lattice
Abstract: We contribute towards resolving the open question of how many hidden layers are required in ReLU networks for exactly representing all continuous and piecewise linear functions on $\mathbb{R}^d$.
While the question has been resolved in special cases, the best known lower bound in the general case is still two hidden layers.
We focus on neural networks that are compatible with certain polyhedral complexes, more precisely with the braid fan.
For such neural networks, we prove a non-constant lower bound of $\Omega(\log\log d)$ on the number of hidden layers required to exactly represent the maximum of $d$ numbers. Additionally, we provide a combinatorial proof that neural networks satisfying this assumption require three hidden layers to compute the maximum of 5 numbers; previously, this had only been verified through excessive computation.
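For context, a standard construction from the literature (not a contribution of this submission) computes the maximum of two numbers with a single hidden ReLU layer,
\[
\max(a,b) \;=\; \mathrm{ReLU}(a-b) + \mathrm{ReLU}(b) - \mathrm{ReLU}(-b),
\]
and pairing up the inputs recursively then represents $\max(x_1,\dots,x_d)$ with the classical $\lceil \log_2 d \rceil$ hidden layers; the lower bounds above concern how far this depth can be reduced.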
Finally, we show that a natural generalization of the best known upper bound to maxout networks is not tight, by demonstrating that a rank-3 maxout layer followed by a rank-2 maxout layer is sufficient to represent the maximum of 7 numbers.
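To make the maxout statement concrete (our reading of the terminology, not wording from the abstract): a rank-$k$ maxout unit computes
\[
x \;\mapsto\; \max_{1 \le i \le k}\bigl(w_i^{\top} x + b_i\bigr),
\]
so composing a rank-3 layer with a rank-2 layer in the obvious tree-like fashion only recovers the maximum of $3 \cdot 2 = 6$ numbers; representing the maximum of 7 numbers with such a two-layer network therefore goes beyond the naive construction.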
Supplementary Material: gz
Primary Area: Theory (e.g., control theory, learning theory, algorithmic game theory)
Submission Number: 13310