If dropout limits trainable depth, does critical initialisation still matter? A large-scale statistical analysis on ReLU networks
Highlights
• Recent work has shown that dropout limits the depth to which information can propagate through a neural network.
• We investigate the effect of initialisation on training speed and generalisation within this depth limit.
• We ask specifically: if dropout limits depth, does initialising critically still matter?
• We conduct a large-scale controlled experiment and perform a statistical analysis of over 12 000 trained networks.
• We show that at moderate depths, critical initialisation gives no performance gains over off-critical initialisations.
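To illustrate the comparison the highlights describe, the sketch below builds a fully connected ReLU network with dropout under both a critical and an off-critical initialisation. It assumes the signal-propagation criticality condition for ReLU with inverted dropout, sigma_w^2 = 2p / fan_in with keep probability p; the depth, width, keep probability, and off-critical scale are illustrative choices, not the paper's exact experimental settings.

```python
# Minimal sketch (PyTorch): critical vs. off-critical initialisation for a
# ReLU network with dropout. Assumes the criticality condition
# sigma_w^2 = 2 * keep_prob / fan_in for inverted dropout; all hyperparameters
# here are illustrative.
import math
import torch
import torch.nn as nn


def make_relu_dropout_net(depth: int, width: int, keep_prob: float,
                          weight_variance_scale: float = 1.0) -> nn.Sequential:
    """Fully connected ReLU network with dropout after every hidden layer.

    weight_variance_scale = 1.0 gives the assumed critical initialisation
    sigma_w^2 = 2 * keep_prob / fan_in; other values move the network
    off criticality.
    """
    layers = []
    for _ in range(depth):
        linear = nn.Linear(width, width)
        sigma_w = math.sqrt(weight_variance_scale * 2.0 * keep_prob / width)
        nn.init.normal_(linear.weight, mean=0.0, std=sigma_w)
        nn.init.zeros_(linear.bias)
        # nn.Dropout takes the drop probability, hence 1 - keep_prob.
        layers += [linear, nn.ReLU(), nn.Dropout(p=1.0 - keep_prob)]
    return nn.Sequential(*layers)


# Critical and off-critical networks of the same moderate depth.
critical_net = make_relu_dropout_net(depth=10, width=512, keep_prob=0.8)
off_critical_net = make_relu_dropout_net(depth=10, width=512, keep_prob=0.8,
                                         weight_variance_scale=1.5)
```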