Normalisation is dead, long live normalisation!
Keywords: normalization, initialization, propagation, skip connections, residual networks
Abstract: Since the advent of Batch Normalisation (BN), almost every state-of-the-art (SOTA) method has used some form of normalisation.
After all, normalisation generally speeds up learning and leads to models that generalise better than their unnormalised counterparts.
Normalisation turns out to be especially useful in combination with skip connections, as used prominently in Residual Networks (ResNets), for example.
However, Brock et al. (2021a) suggest that SOTA performance can also be achieved using ResNets without normalisation!
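To make the normalisation-free idea concrete, the sketch below shows a minimal, hypothetical form of the scaled residual update popularised by Brock et al. (2021a): `h ← h + α · f(h / β)`, where `β` tracks the analytically predicted standard deviation of the signal instead of measuring it with a normalisation layer. The function name, the identity/rotation stand-in for the residual branch `f`, and the specific values of `α` and `β` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def nf_residual_block(h, f, alpha, beta):
    # Normalizer-free residual update: h <- h + alpha * f(h / beta).
    # Dividing by beta (the predicted std of h) keeps the branch input at
    # roughly unit variance; alpha controls how fast signal variance grows.
    # This is a hypothetical minimal sketch, not the exact NF-ResNet block.
    return h + alpha * f(h / beta)

# Demo: with a variance-preserving branch that is roughly uncorrelated
# with its input (a fresh random rotation each step), per-coordinate
# variance follows Var_{i+1} ~= Var_i + alpha**2, so it can be tracked
# analytically rather than measured by a normalisation layer.
d, alpha = 1024, 0.2
h = rng.normal(size=d)          # unit-variance input activations
expected_var = 1.0
for _ in range(10):
    q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # random orthogonal branch
    h = nf_residual_block(h, lambda z: q @ z, alpha, np.sqrt(expected_var))
    expected_var += alpha**2

print(f"empirical variance {h.var():.2f} vs predicted {expected_var:.2f}")
```

The empirical variance stays close to the analytic prediction, which is the property that lets such networks dispense with explicit normalisation layers.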
ICLR Paper: https://openreview.net/forum?id=IX3Nnir2omJ