If dropout limits trainable depth, does critical initialisation still matter? A large-scale statistical analysis on ReLU networks
Highlights
• Recent work has shown that dropout limits the depth to which information can propagate through a neural network.
• We investigate the effect of initialisation on training speed and generalisation within this depth limit.
• We ask specifically: if dropout limits depth, does initialising critically still matter?
• We conduct a large-scale controlled experiment and perform a statistical analysis of over 12 000 trained networks.
• We show that at moderate depths, critical initialisation gives no performance gains over off-critical initialisations.
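To illustrate the comparison the highlights describe, the sketch below builds a fully connected ReLU network with dropout under both a critical and an off-critical initialisation. It assumes the signal-propagation criticality condition for ReLU with inverted dropout, sigma_w^2 = 2p / fan_in with keep probability p; the depth, width, keep probability, and off-critical scale are illustrative choices, not the paper's exact experimental settings.

```python
# Minimal sketch (PyTorch): critical vs. off-critical initialisation for a
# ReLU network with dropout. Assumes the criticality condition
# sigma_w^2 = 2 * keep_prob / fan_in for inverted dropout; all hyperparameters
# here are illustrative.
import math
import torch
import torch.nn as nn


def make_relu_dropout_net(depth: int, width: int, keep_prob: float,
                          weight_variance_scale: float = 1.0) -> nn.Sequential:
    """Fully connected ReLU network with dropout after every hidden layer.

    weight_variance_scale = 1.0 gives the assumed critical initialisation
    sigma_w^2 = 2 * keep_prob / fan_in; other values move the network
    off criticality.
    """
    layers = []
    for _ in range(depth):
        linear = nn.Linear(width, width)
        sigma_w = math.sqrt(weight_variance_scale * 2.0 * keep_prob / width)
        nn.init.normal_(linear.weight, mean=0.0, std=sigma_w)
        nn.init.zeros_(linear.bias)
        # nn.Dropout takes the drop probability, hence 1 - keep_prob.
        layers += [linear, nn.ReLU(), nn.Dropout(p=1.0 - keep_prob)]
    return nn.Sequential(*layers)


# Critical and off-critical networks of the same moderate depth.
critical_net = make_relu_dropout_net(depth=10, width=512, keep_prob=0.8)
off_critical_net = make_relu_dropout_net(depth=10, width=512, keep_prob=0.8,
                                         weight_variance_scale=1.5)
```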