When Are Bias-Free ReLU Networks Effectively Linear Networks?

Published: 24 Apr 2025, Last Modified: 24 Apr 2025 · Accepted by TMLR · License: CC BY 4.0
Abstract: We investigate the implications of removing bias in ReLU networks for their expressivity and learning dynamics. We first show that two-layer bias-free ReLU networks have limited expressivity: the only odd function such a network can express is a linear one. We then show that, under symmetry conditions on the data, these networks have the same learning dynamics as linear networks. This enables us to give analytical time-course solutions for certain two-layer bias-free (leaky) ReLU networks outside the lazy learning regime. While deep bias-free ReLU networks are more expressive than their two-layer counterparts, they still share a number of similarities with deep linear networks. These similarities enable us to leverage insights from linear networks to understand certain ReLU networks. Overall, our results show that some properties previously established for bias-free ReLU networks arise from their equivalence to linear networks.
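
A minimal sketch of why the odd-function claim forces linearity, assuming the standard bias-free parameterization $f(x) = \sum_i v_i \, \mathrm{ReLU}(w_i^\top x)$ (the symbols $v_i$, $w_i$ are illustrative notation, not necessarily the paper's): using the identity $\mathrm{ReLU}(z) - \mathrm{ReLU}(-z) = z$, the odd part of $f$ is
$$
f_{\mathrm{odd}}(x) \;=\; \tfrac{1}{2}\bigl(f(x) - f(-x)\bigr)
\;=\; \tfrac{1}{2}\sum_i v_i \bigl(\mathrm{ReLU}(w_i^\top x) - \mathrm{ReLU}(-w_i^\top x)\bigr)
\;=\; \tfrac{1}{2}\Bigl(\sum_i v_i w_i^\top\Bigr) x,
$$
which is linear in $x$. Hence, if $f$ is itself an odd function, then $f = f_{\mathrm{odd}}$ is linear, consistent with the expressivity limitation stated above.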
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Bruno_Loureiro1
Submission Number: 4082